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Chapter  1 

Executive  Summary 


The  conjugate  gradient  (CG)  algorithm  is  a  popular  method  for  solving  a  system  of  linear  equa¬ 
tions.  It  is  computationally  efficient  and  guaranteed  to  converge  in  a  fixed  number  of  iterations. 
As  an  iterative  method,  CG  provides  a  series  of  approximations  to  the  solution  in  an  expanding 
Krylov  subspace.  In  this  project,  the  CG  algorithm  is  explored  for  reduced-rank  target  detec¬ 
tion  in  space-time  adaptive  processing  (STAP).  There  are  two  specific  goals.  One  is  to  develop 
reduced-rank  STAP  detectors  based  on  the  CG  algorithm  and  investigate  their  detection  perfor¬ 
mance.  The  other  is  to  use  to  the  CG  algorithm  to  build  a  relation  between  the  parametric  adaptive 
matched  filter  (PAMF)  and  reduced-rank  detection  and  develop  efficient  implementations  for  the 
PAMF  detector. 

Toward  the  development  of  CG-based  reduced-rank  detectors,  the  CG  algorithm  is  used  as  an 
efficient  solver  to  compute  the  weight  vector  of  the  matched  filter  (MF).  As  an  iterative  algorithm, 
it  produces  a  series  of  approximations  to  the  MF  weight  vector,  each  of  which  can  be  used  to  filter 
the  test  signal  and  form  a  test  statistic.  This  effectively  leads  to  a  family  of  detectors,  referred 
to  as  the  CG-MF  detectors,  which  are  indexed  by  k  the  number  of  iterations  incurred.  We  first 
consider  a  general  case  involving  an  arbitrary  covariance  matrix  of  the  disturbance  (including 
interference,  noise,  etc.)  and  show  that  all  CG-MF  detectors  attain  constant  false  alarm  rate 
(CFAR)  and,  furthermore,  are  optimum  in  the  sense  that  the  £;-th  CG-MF  detector  yields  the 
highest  output  signal-to-interference-and-noise  ratio  (SINR)  among  all  linear  detectors  within 
the  k- th  Krylov  subspace.  We  then  consider  a  structured  case  frequently  encountered  in  practice, 
where  the  covariance  matrix  of  the  disturbance  contains  a  low-rank  component  (rank-r)  due  to 
dominant  interference  sources,  a  scaled  identity  due  to  the  presence  of  a  white  noise,  and  a 
perturbation  component  containing  the  residual  interference.  We  show  that  the  (r  +  l)-st  CG-MF 
detector  achieves  CFAR  and  an  output  SINR  nearly  identical  to  that  of  the  MF  detector  which 
requires  complete  iterations  of  the  CG  algorithm  till  reaching  convergence.  Hence,  the  (r  +  l)-st 
CG-MF  detector  can  be  used  in  place  of  the  MF  detector  for  significant  computational  saving 
when  r  is  small.  Numerical  results  are  presented  to  verify  the  accuracy  of  our  analysis  for  the 
CG-MF  detectors. 

Originally,  the  PAMF  detector  was  introduced  by  using  a  multichannel  autoregressive  (AR) 
parametric  model  for  the  disturbance  signal  in  STAP  detection.  While  the  parametric  approach 
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brings  in  benefits  such  as  reduced  training  and  computational  requirements  as  compared  with 
fully  adaptive  STAP  detectors,  the  PAMF  detector  as  a  reduced-rank  solution  remains  unclear. 
Toward  the  development  of  a  reduced-rank  detection  framework  for  the  PAMF  detector,  the  CG 
algorithm  to  solve  the  linear  prediction  problem  arising  in  the  PAMF  detector.  It  is  shown  that 
CG  yields  not  only  a  new  computationally  efficient  implementation  of  the  PAMF  detector,  a 
new  and  efficient  AR  model  order  selection  method  that  can  naturally  be  integrated  with  CG 
iterations,  but  it  also  offers  new  perspectives  of  PAMF  as  a  reduced-rank  subspace  detector.  We 
first  consider  the  integration  of  the  CG  algorithm  with  the  MF  and  parametric  matched  filter 
(PMF)  when  the  covariance  matrix  of  the  disturbance  signal  is  known.  It  is  then  extended  to  the 
adaptive  case  where  the  covariance  matrix  is  estimated  from  training  data.  Important  issues  such 
as  computational  complexity  and  convergence  rate  are  discussed.  Performance  of  the  proposed 
CG-PAMF  detector  is  examined  by  using  the  KASSPER  and  other  computer  generated  data. 
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Chapter  2 

Conjugate  Gradient  Matched  Filter 


2.1  Introduction 

This  project  is  concerned  with  the  multichannel  signal  detection  problem  from  temporally  and 
spatially  correlated  clutter  and/or  jammer  background,  which  is  found  in  phased-array  radar, 
sonar  and  many  other  applications.  A  widely  explored  technique  for  multichannel  signal  detec¬ 
tion  is  space-time  adaptive  processing  (STAP)  [1],  first  proposed  by  Brennan,  Reed  and  Mal- 
lett  [2].  Most  STAP-based  methods,  such  as  the  adaptive  matched  filter  (AMF)  [3]  and  Kelly’s 
generalized  likelihood  ratio  test  (GLRT)  [4],  need  to  invert  a  large  space-time  covariance  matrix, 
thus  requiring  a  substantial  amount  of  secondary  or  training  signals  as  well  as  a  high  compu¬ 
tational  cost.  Aimed  at  mitigating  the  demanding  training  and  computational  requirements  of 
the  full-dimensional  STAP  methods,  reduced-rank  STAP  techniques,  such  as  eigencanceler  [5], 
principal-component  method  [6],  cross-spectral  metric  [7],  multistage  Wiener  filter  (MWF)  [8], 
etc.,  have  been  proposed  to  reduce  the  dimension  of  the  data  in  advance  of  detection.  Such  re¬ 
duced  dimensional  techniques  have  been  extensively  studied;  see,  e.g.,  [9]  and  references  therein. 

The  conjugate  gradient  (CG)  algorithm  (e.g,.  [10])  is  a  popular  method  for  solving  a  system 
of  linear  equations.  It  is  computationally  efficient  and  guaranteed  to  converge  in  a  fixed  number 
of  iterations.  As  an  iterative  method,  CG  provides  a  series  of  approximations  to  the  solution  in  an 
expanding  Krylov  subspace  [10].  The  CG  algorithm  has  been  explored  for  STAP  detection  and 
beamforming  in  a  multitude  of  recent  studies.  Connections  between  the  CG  algorithm  and  the 
MWF  for  STAP  detection  are  explored  in  [11,12],  and  an  equivalence  is  found  between  the  CG 
and  the  multistage  Wiener  filters.  For  beamforming,  a  Krylov  subspace  adaptive  beamformer  is 
introduced  to  improve  the  robustness  of  model  order  determined  in  [13]  and  separate  the  desired 
signal  from  interference  in  [14].  In  [15],  the  CG  algorithm  is  employed  to  solve  a  linear  prediction 
problem  in  the  parametric  adaptive  matched  filter  (PAMF)  [16, 17],  which  also  shows  that  the 
PAMF  detector  is  a  member  of  the  reduced-dimensional  STAP  family. 

In  this  chapter,  we  explore  the  CG  algorithm  in  the  matched  filter  (MF)  for  STAP  detection. 
The  CG  algorithm  is  employed  to  iteratively  calculate  the  weight  vector  of  the  MF  detector, 
which  produces  a  series  of  progressively  improved  approximations  to  the  MF  weight  vector.  As 
we  shall  show,  each  of  the  intermediate  weight  vectors  generated  by  CG  iterations  can  be  used  to 
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form  a  decision  variable,  which,  collectively,  forms  a  family  of  STAP  detectors  referred  to  herein 
as  the  CG-MF  detectors.  Our  goal  is  to  examine  the  performance  of  these  CG-MF  detectors 
relative  to  the  benchmark  MF  detector.  The  motivation  is  that  if  any  detector  within  the  CG-MF 
family  offers  a  performance  close  to  that  of  the  MF  detector,  then  the  latter  can  be  substituted 
by  the  former,  especially  if  the  CG-MF  detector  is  obtained  using  a  few  iterations  and,  as  such, 
is  computational  more  efficient.  We  consider  two  cases,  one  involving  a  general  covariance  for 
the  disturbance  (i.e.,  clutter,  jamming,  and  noise)  and  the  other  involving  a  structured  disturbance 
covariance. 

Specifically,  for  the  first  case,  the  space-time  covariance  matrix  of  the  disturbance  is  assumed 
positive  definite  but  otherwise  arbitrary.  The  probability  of  false  alarm  and  probability  of  detec¬ 
tion  of  the  CG-MF  detectors  are  derived  in  closed  form.  The  analysis  shows  that  the  CG-MF 
detectors  achieve  constant  false  alarm  rate  (CFAR),  irrespective  of  the  number  of  iterations.  The 
probability  of  detection,  for  a  given  false  alarm  probability,  is  dictated  by  the  output  signal-to- 
interference-and-noise  ratio  (SINR)  of  the  linear  filter  employed  by  the  CG-MF  detector,  which 
changes  over  the  CG  iterations.  It  is  found  that  the  CG-MF  detector  obtained  at  the  k-\h  CG  iter¬ 
ation  is  optimum  in  the  sense  that  it  yields  the  largest  output  SINR  over  all  linear  detectors  within 
the  /^-dimensional  Krylov  subspace.  As  a  result,  the  output  SINR  and  detection  probability  of 
the  CG-MF  detector  improves  with  more  CG  iterations  and  converges  to  their  counterparts  of  the 
MF  detector. 

For  the  second  case,  the  disturbance  space-time  covariance  matrix  is  assumed  to  include  a 
low -rank  component  Ft,,  a  scaled  identity  due  to  the  presence  of  a  white  noise  in  the  disturbance, 
and  a  perturbation  component  A  composed  of  entries  that  are  generally  small  compared  with 
those  in  Ft,.  The  above  structure  is  frequently  encountered  in  many  applications,  whereby  Ft, 
contains  the  dominant  clutter  scatterers  or  interference  sources  that  need  be  effectively  mitigated, 
whereas  A  may  include  insignificant,  residual  clutter/interference  sources.  In  the  absence  of  the 
perturbation,  it  is  well-known  that  the  CG  algorithm  converges  in  r  +  1  iterations,  where  r  is  the 
rank  of  Ft,  [10],  and  if  r  is  small,  requires  significant  less  computation  than  directly  computing  the 
matrix  inverse  needed  by  the  MF  detector.  However,  with  the  presence  of  A,  the  CG  algorithm 
in  general  requires  full  iterations,  i.e.,  JN  iterations  with  J  being  the  number  of  channels  and 
N  the  number  of  temporal  samples,  to  reach  convergence.  Despite  this,  for  the  obvious  reason 
of  complexity  reduction,  there  is  an  interest  to  investigate  if  we  can  abort  the  CG  algorithm  after 
r  +  1  iterations,  and  use  the  (r  +  l)-st  CG-MF  detector  within  the  CG-MF  family  in  place  of 
the  MF  detector,  provided  that  the  perturbation  is  small.  To  answer  the  question,  an  analysis  is 
provided  which  reveals  a  relation  between  the  weight  vectors  of  the  (r  +  l)-st  CG-MF  detector 
and,  respectively,  the  MF.  The  result  is  next  used  to  compute  and  compare  the  output  SINRs  of 
the  two  detectors.  Interestingly,  it  is  found  that  the  output  SINRs  are  identical  within  a  first-order 
approximation,  that  is,  the  output  SINR  loss  experienced  by  the  CG-MF  detector  relative  to  the 
MF  is  a  higher  order  term  of  the  perturbation  which  can  be  neglected  for  small  perturbation.  As 
such,  the  detection  probabilities  are  also  nearly  identical,  and  be  computed  by  using  the  results 
obtained  for  the  first  (general  covariance  matrix)  case. 

The  remainder  of  this  chapter  is  organized  as  follows.  The  signal  detection  problem  is  intro¬ 
duced  in  Section  2.2  and  the  CG-MF  detector  described  in  Section  2.3.  The  performance  analysis 
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of  the  CG-MF  detectors  is  presented  in  Section  2.4,  including  the  general  case  of  disturbance  co- 
variance  matrix  in  Section  2.4.1  and  the  structured  case  in  Section  2.4.2.  Our  analytical  results  are 
verified  by  computer  simulations  in  Section  2.5.  Finally,  conclusions  are  summarized  in  Section 
3.7. 

Vectors  and  matrices  are  denoted  by  boldface  lower-case  and  upper-case  letters,  respectively. 
Transpose,  complex  conjugate  and  complex  conjugate  transpose  are  respectively  represented  by 
(•)T,  (•)*  and  (-)H.  C  and  M  denote  the  complex  and  real  number  fields.  CMiji.  R )  denotes  a 
multivariate  Gaussian  random  variable  with  mean  /i  and  covariance  matrix  R. 


2.2  Data  Model 


Consider  a  ./-channel  sequence  x(n  )  G  CJxl,  n  —  1,  2,  •  •  •  ,  N,  received  using  an  array  antenna 
which  is  corrupted  by  a  space-time  correlated  disturbance  random  process  c(n).  The  detection 
problem  involves  the  following  binary  hypotheses: 


H0  :  x(n )  =  c(n) 

Hi  :  x(n )  =  as(n )  +  c(n) 


(2.1) 


where  s(n )  is  a  known  J-channel  signal  and  a  is  its  unknown  complex  amplitude.  For  the  conve¬ 
nience  of  later  discussions,  define  the  following  JN  x  1  vectors:  s  =  [sr(l),  sT(2),  •  •  •  ,  sT(N)]T, 
c  =  [cT(l),  ct(2),  •  •  •  ,  ct(N)]t,  and  x  =  [ccT(l),  xT(2),  •  •  •  ,  xT(N)]T.  In  STAP,  s  is  known 
as  the  space-time  steering  vector.  For  a  side-looking  uniform  linear  array  (ULA),  s  is  given  by 


s  =  st0ss  (2.2) 

where  st  =  (1/a /N)  [l,  e*27r^d, ,  •  •  •  ,  el27r(Ar-1)/d]T  is  the  temporal  steering  vector  with  a  normal¬ 
ized  Doppler  frequency  fd,ss  =  (1/ \fj)  [l,  e*2?r^ , ,  •  •  •  ,  e*27r(J-i)/sj T  [s  the  spatial  steering  vector 
with  a  normalized  spatial  frequency  fs,  and  (8)  denotes  the  Kronecker  product.  For  STAP  detec¬ 
tion,  a  standard  assumption  is  that  the  disturbance  c  is  a  Gaussian  random  vector  with  zero-mean 
and  space-time  covariance  matrix  R  e  CJArxJJV  [  ]  ].  it  follows  that  x  ~  CJ\f (as,  R),  where 
a  =  0  under  H0  and  a/0  under  //, . 


2.3  Conjugate  Gradient  Matched  Filter 


The  optimum  detector  for  (2.1)  is  the  matched  filter  (MF)  (e.g.,  [3]): 


^MF  ~ 


sH  R~1x\2  H! 

8hR~18  Ho1 


(2.3) 


where  r]  is  the  threshold  of  MF.  The  performance  of  the  MF  is  regarded  as  a  benchmark  for  all 
linear  tests.  For  notational  convenience,  we  will  frequently  denote  a  detector  by  a  linear  filter 
weight  vector.  For  the  MF  detector,  the  weight  vector  is 


1UMF  =  R  1S- 


(2.4) 
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Then  the  MF  test  statistic  can  be  alternatively  expressed  as: 


^MF  — 


W^i:Rwm:  Ho 


(2.5) 


For  typical  STAP  applications,  the  covariance  matrix  R  has  a  large  dimension.  As  a  result, 
direct  matrix  inversion  is  usually  not  recommended  to  compute  the  weight  vector  (2.4)  due  to  its 
computational  complexity.  We  consider  herein  an  alternative  way  by  employing  the  conjugate 
(CG)  algorithm  [10],  which  iteratively  finds  a  sequence  of  linear  weight  vectors  wk,  k  =  0, 1, ... , 
that  are  guaranteed  to  converge  to  the  MF  weight  vector  in  no  more  than  JN  iterations.  Each 
of  the  weight  vector  wk  can  be  used  to  form  a  detector  as  in  (2.5).  As  such,  the  CG  iterations 
yield  a  family  of  detectors,  referred  to  as  the  CG-MF  detectors.  The  CG-MF  detectors  are  closely 
related  to  the  so-called  multistage  Wiener  filter  [11].  We  examine  the  performance  of  these  CG- 
MF  detectors  relative  to  the  MF  detector.  To  introduce  necessary  notation,  The  CG  algorithm  is 
briefly  summarized  as  follows. 

Initialization.  Initialize  the  conjugate  direction  vector  di,  gradient  vector  7,  and  initial  solu¬ 
tion  w0: 


d  1  =  — 7i  =  s 
w0  =  0. 


for  k  —  1,  2,  •  •  • ,  till  convergence  ( k  <  JN)  do 
Update  the  step  size  ak: 

ll7j2 

k  dk  Rdk 

Update  the  solution  wk: 

to k  wk—\  T  akdk. 


Update  the  gradient  vector  7fc+1: 


7fc+i  =  Ik  +  (XkRdk . 
Update  the  conjugate  direction  vector  dk+ 1- 


dk+i  d} 
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fc+ii 


117* 


7fc+i- 


(2.6) 

(2.7) 

(2.8) 

(2.9) 

(2.10) 

(2.11) 


end  for 

A  quick  comment  on  the  complexity  is  in  order.  Each  iteration  of  the  CG  algorithm  involves 
one  matrix-vector  product,  requiring  about  (){{J  N]1)  flops.  With  full  JN  iterations,  the  CG 
algorithm  has  a  complexity  of  0((JN )3)  flops,  comparable  with  alternative  linear  solvers  such 
as  the  QR  factorization  [10].  In  many  practical  cases,  the  CG  algorithm  may  require  far  fewer 
than  full  iterations  (see  Section  2.4.2  and  also  [10,  Chap.  10]  for  discussions  on  the  convergence 
of  the  CG  algorithm),  leading  to  significant  reduction  in  complexity. 
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2.4  Analysis 

In  this  section,  we  consider  the  performance  of  the  CG-MF  detectors  under  two  cases.  The  first 
involves  a  general  covariance  matrix  R  that  is  positive  definite  but  otherwise  arbitrary,  whereas 
the  other  deals  with  a  structured  covariance  that  is  frequently  encountered  in  practice. 

2.4.1  General  Covariance  matrix 

It  would  be  useful  to  represent  the  CG-MF  detector  wk  by  using  the  conjugate  direction  vectors 
{dk}  in  closed  form.  First,  (2.9)  implies  that 


wk  =  DkOt-k 


(2.12) 


where  a*,  =  [an,  0:2,  •  •  ■«  ,  ak}T  contains  the  stepsizes  and  Dk  =  [di,  d2,  •  •  •  ,  dk\  consists  of  the 
first  k  conjugate  direction  directors.  One  property  of  Dk  is  that  it  diagonalizes  the  covariance 
matrix  R  [10,  p.  523]: 

DkRDk  =  Ak  (2.13) 

where  \k  =  diag ■  ■  ■  ,u\)  and  uk  =  (d1^ Rdk)  2.  This  allows  ctk  to  be  compactly  ex¬ 
pressed  as 

ak  =  A.?D*8  (2.14) 

which  gives  the  following  close-form  expression  for  wk; 

wk  =  DkK^D^s.  (2.15) 


The  /c-th  CG-MF  detector  using  wk  is  given  by 


\w?x\2  H 1 

H~D  U 

W%RwkH0 


(2.16) 


Theorem  1  For  the  k-th  CG-MF  detector  wk, 


(a)  It  is  a  linear  minimum  mean  square  estimator  that  minimizes 

wk  =  are  min  \E\\a  —  wHx\\2}  (2.17) 

w£K(R,s,k) 

among  all  linear  estimators  within  the  k-dimensional  Krylov  subspace 
JC(R,  s,  k )  =  span{s,  Rs,  R2s,  •  •  ■  ,  Rk~1s}. 


(b)  wk  yields  the  largest  output  SINR 


Pk 


2 

w^Rwk 


among  all  linear  detectors  within  IC(R,  s,  k). 


(2.18) 
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(c)  Under  the  aforementioned  Gaussian  assumption,  namely  x  ~  CAf(as.R)  with  a  =  0 
under  H0  and  a  f  0  under  Hi,  the  probability  of  false  alarm  Pfa±  and  the  probability  of 
detection  Ifj-  of  w k  are 


Pfajz  =  (-Tfc) 

(2.19) 

P i,k  A J 2^7 k ) 

(2.20) 

where  (){■)  is  the  Marcum  Q  function. 

Sketch  of  proof:  As  the  above  results  are  quick  extensions  from  standard  knowledge,  we  only 
provide  a  sketch  of  proof.  Result  (a)  is  proved  by  observing  that  the  Krylov  subspace  is  also 
spanned  by  the  conjugate  direction  vectors  [10]:  fC(R,  s,  k )  =  span{d1?  d2,  ■  ■  ■  ,  dk}.  As  such, 

wk  =arg  min  {E||a  —  wHx\\2} 

w=Dka  (2  21) 

=Dk  (D«RDky'  D«s  =  DkApD^s 

which  is  identical  to  (2.15). 

Result  (b)  follows  immediately  from  (a),  since  minimizing  the  mean-square  error  (MSE)  is 
equivalent  to  maximizing  the  SINR  [18]. 

To  show  (c),  we  use  (2.13)  and  (2.12)  to  rewrite  the  test  variable  tk  of  (2.16)  as 

\rvH  r)HT-\2 

t, k  =  Akcx.k  =  \yk\2  (2.22) 

where  yk  =  ( a D  j1  x )  ( a [l  Aka k )  2 .  it  is  clear  that  yk  ~  CAf(0, 1)  under  If,  and,  respectively, 
yk  ~  CM  ((acxkDks)(cxkAkcxk)-fl^  under  If .  As  such,  tk  is  central  and,  respectively, 
noncentral  Chi-square  distributed  under  H0  and  If,  from  which  (2.19)  and  (2.20)  follow  imme¬ 
diately.  Q.E.D. 

A  number  of  remarks  are  in  order.  First,  (2.19)  implies  that  CG-MF  detectors  for  all  k  are 
CFAR  detectors.  Their  test  variables  tk  are  all  identically  distributed  to  that  of  the  MF  test 
variable  tMF,  irrespective  of  k  the  number  of  iterations.  Second,  (b)  implies  that  pk  <  pk+i,  since 
/C(_R,  s,  k)  C  JC(R,  s,k  +  1).  Hence,  the  CG-MF  is  a  family  of  CFAR  detectors  wk  composed 
of  both  reduced-rank  detectors  ( k  <  JN )  and  the  full-rank  MF  detector  ( k  =  JN),  which 
offers  a  natural  way  to  trade  complexity  for  performance.  Specifically,  the  detection  probability 
If  k  of  the  CG-MF  detector  wk  increases  with  more  CG  iterations  (i.e.,  a  larger  k ),  at  higher 
computational  complexity.  The  trade-off  and  the  analytical  expression  (2.20)  allow  one  to  save 
the  computational  cost  by  selecting  an  appropriate  reduced-rank  CG-MF  detector  that  offers  a 
targeted  PA  k,  without  going  through  all  CG  iterations. 

2.4.2  CG-MF :  Structured  Covariance  Matrix 

One  desired  property  of  the  CG  algorithm  is  its  fast  convergence.  In  general,  it  takes  no  more 
than  JN  iterations  to  obtain  the  MF  detector  vector  iuMf  (2.4)  [10].  Even  faster  convergence 
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is  possible  if  the  covariance  matrix  of  the  disturbance  has  some  specific  structure,  which  makes 
the  CG  algorithm  more  attractive  than  many  alternative  approaches  to  computing  (2.4).  In  this 
section,  we  examine  the  performance  of  the  CG-MF  detectors  in  the  case  where  the  disturbance 
covariance  matrix  has  some  low-rank  structure.  Specifically,  suppose  R  consists  of  the  following 
two  components: 

R i  +  o2J  (2.23) 

where  R,  is  a  rank-r  positive  semi-definite  matrix  (r  <  JN)  and  I  an  identify  matrix.  For  such  a 
structured  matrix,  it  is  known  that  the  CG  algorithm  converges  to  the  baseline  MF  solution  using 
at  most  r  +  1  iterations  [10],  i.e.,  wr+ 1  =  mMF,  which  is  computationally  very  efficient. 

Many  practical  applications  involve  a  disturbance  covariance  matrix  having  a  structure  similar 
to  (2.23).  For  example,  in  airborne  radar  applications,  the  disturbance  covariance  matrix  often 
consists  of  two  components,  namely  a  low-rank  R,  due  to  the  clutter  and  jamming  and  a  scaled 
identity  o'll  due  to  the  thermal  noise,  where  o\  denotes  the  noise  variance.  The  rank  r  is  typically 
much  smaller  than  the  joint  spatio-temporal  dimension  JN.  Specifically,  if  the  disturbance  is 
primarily  due  to  ground  clutter  and  thermal  noise,  then  according  to  Brennan’s  rule  [2],  the  rank 
of  the  clutter  covariance  matrix  for  the  full-dimensional  MF  is  approximately 

r  «  \J  +  (N-  1)0]  (2.24) 

where  (3  =  2 vgTr/d,  vg  is  the  platform  velocity,  Tr  is  the  pulse  repetition  period,  d  is  the  antenna 
element  spacing,  and  [ •]  rounds  a  real-valued  number  towards  infinity. 

Estimating  the  rank  r  can  be  a  tricky  issue  since  (2.24)  may  not  hold  for  all  clutter  scenarios 
encountered  in  practice.  The  CG  algorithm  has  an  advantage  of  not  requiring  to  know  r  a  priori, 
since  at  the  (r  +  l)-st  iteration,  the  residual  s  —  Rwk,  which  is  also  the  negative  gradient 
vanishes.  This  is  the  stopping  rule  used  by  the  CG  algorithm  to  stop  iterations  [10].  Other  STAP 
detectors  designed  to  take  advantage  of  the  structure  (2.23),  such  as  the  low  rank  normalized 
matched  filter  (LRNMF)  [19]  which  employs  the  principal  eigenvectors  of  the  covariance  matrix, 
requires  an  estimate  of  r  and  its  performance  is  quite  sensitive  to  the  accuracy  of  the  estimate. 

While  the  convergence  of  the  CG  for  a  structured  covariance  matrix  exactly  like  (2.23)  is 
well  known,  in  the  following,  we  consider  a  different  but  related  scenario  that  the  disturbance 
covariance  matrix  R  is  a  perturbed  version  of  (2.23): 

R  =  Ri  +  oil  +  A  4  R0  +  A  (2.25) 

where  Ro  —  Ri  +  oil  is  structured  as  in  (2.23)  and  A  is  a  Hermitian  perturbation  matrix  that 
is  assumed  to  be  small,  i.e.,  ||A||  <C  ||J?o||-  Since  the  perturbation  is  small,  it  is  of  interest 
to  examine  the  following  important  questions:  Can  we  benefit  from  the  CG  algorithm  when  the 
disturbance  covariance  matrix  is  given  as  (2.25)?  Can  it  reach  convergence  or  almost  convergence 
in  r  +  1  iterations?  How  is  the  detection  performance  of  the  CG-MF  detector  wr+]  compared 
to  the  MF  detector?  A  second  type  of  perturbation  is  stochastic  perturbation  arising  from  finite 
training  data  effects  in  estimating  the  covariance  matrix.  The  random  perturbation  issue  is  a 
subject  of  ongoing  research  and  will  be  reported  in  a  future  publication. 

Before  we  address  the  above  questions,  some  discussions  on  the  model  (2.25)  are  in  order. 
The  perturbation  model  may  exist  in  many  scenarios.  For  example,  in  airborne  radar  applications, 
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the  covariance  matrix  R  rarely  has  exactly  r  +  1  distinct  eigenvalues  as  in  (2.23).  Typically,  R 
contains  a  few  principal  eigenvalues  due  to  the  dominant  clutter  scatterers,  but  the  other  eigenval¬ 
ues  are  rarely  identical  and  spread  around  the  noise  level  [1].  By  decomposing  R  as  (2.25),  R, 
contain  only  the  dominant  clutter  scatterers,  and  the  effect  of  the  less  significant  clutter  scatterers 
can  be  included  in  A.  The  same  can  be  extended  to  a  general  interference  scenario,  where  R,  is 
the  covariance  matrix  of  only  a  few  major  interference  sources  that  have  to  be  mitigated  at  the 
receiver,  whereas  A  contains  the  residual  interference. 

Given  a  structured  covariance  matrix  R  with  perturbation  as  in  (2.25),  the  analytical  results 
of  Section  2.4.1  of  the  family  of  CG-MF  detectors  are  still  applicable.  To  address  the  earlier 
questions,  we  consider  the  CG-MF  detector  wr+ 1  obtained  at  the  (r  +  l)-st  CG  iteration,  and 
examine  the  performance  of  this  CG-MF  detector  relative  to  the  MF  detector  iuMf-  We  first 
present  a  result  that  relates  the  weight  vectors  for  the  two  detectors. 

Lemma  1  Consider  the  linear  equation  RwMF  =  s,  where  R  —  R  +  o2l  +  A  =  R0  +  A 
is  a  positive-definite  Hermitian  matrix,  R,  is  a  rank-r  positive  semi-definite  Hermitian  matrix, 
o2  >  0  is  a  constant,  and  A  is  a  Hermitian  perturbation  matrix.  If  the  perturbation  is  small  such 
that  ||  A ||  <C  1 1 Ro 1 1 ,  the  MF  solution  wMf  can  be  approximated  by  the  CG-MF  solution  wr+ \, 
with  the  approximation  error  given  by 


wMF  -  wr+ 1  =  R0  2PfrR0  3d  +  o(||A||)  (2.26) 


where  o(||A||)  contains  the  second-  and  higher-order  perturbation  terms  that  can  be  neglected 
for  small  ||  A||, 


rp  7?  2  T1 

j.  r  -ri'0  t 

(2.27) 

Tr  =  [s,  R0s,  R20s ,  ■■■  ,  R'0s] 

(2.28) 

pi.  _  j-  rj~t  p  j  —lrjn11 

(2.29) 

and 

d  =  (AR0 1  +  Ro<f>A(T? floTr)-12f )  s 

(2.30) 

= 

r 

0,  As,  R0As  +  A R0S,  •  •  •  ,  ^  R^AR^s 

k=  1 

(2.31) 

Proof:  See  Appendix  2.7. 

Q.E.D. 

When  the  perturbation  vanishes,  it  is  straightforward  to  show  from  (2.51)  of  Appendix  2.7 

that 

mMF  -  Wr+1  =  R0~5p±rRQ  2s  =  0  (2.32) 

which  offers  another  proof  that  the  CG-MF  detector  mr+1  converges  to  the  MF  detector  iuMF 
when  A  =  0,  a  result  we  already  know  for  the  CG  algorithm  [10].  Interestingly,  the  first  equality 
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of  the  above  equation  resembles  (2.26)  except  that  d  is  replaced  by  s.  It  is  also  noted  that  the 
matrix  <E»a  can  be  easily  calculated  by  the  following  simple  recursion: 


0i  =0 

0fc  =ARo  2s  +  -Ro0fc-i,  k  =  2,  3,  •  •  •  ,  r  +  1. 

We  now  consider  the  output  SINR  of  the  CG-MF  detector  wr+ 1, 

Pr+ 1 


(2.33) 


I  w  ^  s  P 
|2  lU7r+l*l 


W?+1Rwr+ 1 


(2.34) 


and  its  relation  to  the  output  SINR  of  the  MF  detector.  The  following  result  addresses  their 
relationship. 

Theorem  2  Under  the  same  conditions  stated  in  Lemma  1,  the  output  SINRs  of  the  MF  detector 
iuMf  and  the  CG-MF  detector  wr+l  are  identical  within  a  first-order  approximation.  That  is, 


tip  —  PMF  ~  Pr+ 1  “  o(||A| 


where 


Pr+ 1 


\qi)H  o|2 

2  l^V+l^l 
w¥,,Rwr 


(2.35) 


(2.36) 


7r_|_-^  JX07r_|_l 

and  pmf  is  similarly  defined  by  replacing  wr+l  with  w mf- 

Proof:  The  proof  goes  by  direct  calculation  and  using  Lemma  1.  The  loss  of  output  SINR  of 
the  CG-MF  relative  to  the  MF  is  given  by 


SP  —  Pmf  ~  Pr+ 1  —  0| 


I  u;MFhl 


\w"+1s\2 


w"fRwmf  wf+1Rwr+1 


(2.37) 


First  we  consider  the  difference  between  sHwM f  and  sHwr+ 1.  Using  Lemma  1,  we  have 


s  wMF  -  sHwr+1  =  sHR0  2 P4  R0  2d  +  o(||A| 


(2.38) 


Since  s  is  orthogonal  to  the  column  space  of  R0  2P;~,  R0  2  [also  see  (2.32)],  we  have 


and  (2.38)  reduces  to 
It  follows  that 


shR--2P±R~-2  =  0 

(2.39) 

shWmf  -  sHwr+ 1  =  o(||A||). 

(2.40) 

Wf02-Ki02  =  o(||A||). 

(2.41) 
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Next,  we  consider  the  difference  between  the  denominators  w^FRwMF  and  w^+1Rwr+1: 

w^FRwMF  -  w”+1Rwr+1 
=  2^ic{w^FR(wMF  -  mr+i)} 

-  (>MF  -  Wr+1)H R(wmf  -  Wr+ 1) 

=  2  9\z{sH  (w  —  tur+1)} 

-  (wM F  -  wr+1)H R(wmf  -  wr+ 1)  (2.42) 

where  93e{-}  denotes  the  real  part.  Again  using  Lemma  1,  we  have 

Omf  -  wr+1)HR(wM F  -  wr+ 1)  =  o(||  A||).  (2.43) 

Substituting  (2.40)  and  (2.43)  into  (2.42)  yields 

w^fRwmf  -  Wr+1Rwr+ 1  =  o(||  A II).  (2.44) 

Finally,  from  (2.37),  the  output  SINR  loss  of  the  CG-MF  detector  is  given  by 

,  2\w^Fs\2w^+1Rwr+1  -  \w?+1s\2w£fRwmp 
p  |a|  w»FRwMFw?+1Rwr+1  '  ‘ 

Substituting  (2.41)  and  (2.44)  into  the  numerator  of  (2.45),  we  have 

\w^Fs\2w^+1Rwr+1  -  \w^+1s\2w^fRwmf 

=  |iumFs|2  (wfipRWMF  -  o(||A||)) 

-  (kMFS|2  -  o(||  AID)  ^MF-R^MF 

=  o(ll  All)  (2.46) 


from  which  (2.35)  immediately  follows.  Q.E.D. 

Remark:  It  is  interesting  to  note  that  while  Lemma  1  indicates  that  the  difference  between  the 
weight  vectors,  i.e.,  mMF  —  wr+1,  contains  first-order  terms  of  the  perturbation,  such  first-order 
differences  vanish  in  terms  of  the  output  SINR,  as  shown  in  the  above  proof.  Theorem  2  implies 
that  that  the  probabilities  of  detection  of  the  MF  and  CG-MF  detectors  are  also  identical  within  a 
first-order  approximation.  This  has  important  practical  implication.  In  particular,  even  though  the 
CG  algorithm  using  R  generally  requires  full  (i.e.,  JN  )  iterations  before  it  reaches  convergence 
(since  R  does  not  have  the  low-rank  structure  (2.23)),  we  can  take  the  intermediate  result  wr+] 
obtained  at  the  (r  +  l)-st  iteration  and  obtain  nearly  the  same  detection  performance  as  the  MF 
detector,  provided  that  A  is  sufficiently  small.  While  the  latter  condition  cannot  be  quantitatively 
specified,  numerical  simulation  can  be  used  to  examine  the  effect  of  the  perturbation  size  and  the 
accuracy  of  the  approximation. 
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Normalized  Output  SINR  (N=32,J=4) 


Figure  2.1:  Normalized  output  SINR  versus  the  number  of  iterations  for  the  CG-MF  detector 

( J  =  4;  N  =  32). 

2.5  Numerical  Results 

2.5.1  General  Covariance  Matrix 

Computer  simulation  is  employed  to  verify  the  analytical  results  reported  in  the  previous  sections. 
We  first  consider  the  general  covariance  matrix  case  studied  in  Section  2.4.1.  Our  numerical 
results  for  this  case  use  a  disturbance  covariance  matrix  R  obtained  from  the  KASSPER  data 
set  [20],  which  is  a  simulated  data  set  that  includes  practical  airborne  radar  parameters  and  issues 
found  in  a  real-world  clutter  environment.  The  radar  platform  considered  in  this  data  set  has  1 1 
horizontal  antenna  elements.  For  simplicity,  we  use  only  the  outputs  of  the  first  J  =  4  channels 
for  processing.  The  number  of  pulses  is  N  =  32,  and  the  probability  of  false  alarm  is  Pfa  =  0.01. 
We  first  examine  the  output  SINR  pk,  defined  in  (2.18),  of  the  CG-MF  detector  wk-  Fig.  2.1 
shows  the  normalized  output  SINR  pi,/  Pmy,  where  the  normalizing  factor  pMF  is  the  output  SINR 
of  the  MF  detector,  versus  the  number  of  iterations.  It  is  observed  that  pk  converges  rapidly  to 
Pmf  as  k  increases. 

The  probability  of  detection  for  the  MF  detector  and  the  CG-MF  detector  after  k  =  10,  20 
and  40  iterations,  respectively,  is  shown  in  Fig.  2.2  as  a  function  of  the  MF  output  SINR,  defined 
as  pMF  =  \a\2sHR  1s.  It  is  seen  that  that  with  k  —  40  iterations,  the  CG-MF  detector  achieves 
nearly  identical  detection  performance  as  the  MF  detector,  which  requires  JN  =  128  CG  iter¬ 
ations.  Finally,  the  detection  probability  of  the  CG-MF  detector  as  a  function  of  the  number  of 
iterations  k  is  shown  in  Fig.  2.3  for  the  MF  output  SINR  =  5, 10,  and  20  dB,  respectively.  It  is 
seen  that  the  detection  probability  of  the  CG-MF  detector  is  a  non-decreasing  function  of  k  in  all 
cases,  as  predicted  in  Section  2.4.1. 
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Performance  of  MF  and  CG-MF  (N=32,J=4) 


Figure  2.2:  Probability  of  detection  versus  SINR  for  the  MF  and  CG-MF  detectors  with  several 
different  numbers  of  iterations  ( J  =  4;  N  =  32;Pfa  =  0.01). 


Probability  of  Detection  Pd  k  Versus  Number  of  Iterations  (N=32,J=4) 


Figure  2.3:  Probability  of  detection  Pd  k  versus  number  of  iterations  k  (J  =  4;  N  —  32;  Pfa  = 
0.01). 


2.5.2  Structured  Covariance  Matrix  with  Perturbation 


We  now  consider  the  case  examined  in  Section  2.4.2  where  the  covariance  R  is  a  perturbed 
version  of  a  structured  R0.  We  demonstrate  how  the  convergence  of  the  CG-MF  detector  is 
directly  affected  by  the  size  of  the  perturbation.  We  employ  a  relative  perturbation  size,  defined 
as 


tr(AHA) 

tr(R^Ro) 


(2.47) 
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Figure  2.4:  Output  SINR  of  the  MF  detector  and  CG-MF  detector  after  r  + 1  =  7  iterations  versus 
the  relative  perturbation  size  (r  =  6,  JN  =  16). 

where  tr(-)  is  the  trace  of  a  matrix.  We  tried  several  ways  of  generating  the  perturbation  matrix 
A  and  obtained  similar  results.  The  ones  presented  here  were  based  on  the  following  approach. 
For  any  structured  covariance  matrix  R0  as  described  in  (2.25)  and  a  given  perturbation  size  n, 
1)  randomly  generate  R  as  a  complex  Wishhart  matrix  R  with  mean  i?0;  and  2)  form  R  as  a 
scaled  version  of  R  such  that  A  =  R  —  R0  has  a  prescribed  perturbation  size  n.  It  is  noted  that 
although  R  is  generated  as  a  random  matrix  whose  mean  is  not  R0  (which  is  the  mean  of  R'), 
in  each  trail  R  is  treated  as  a  deterministic/known  matrix  that  is  a  perturbed  version  of  R0  with 
perturbation  size  k. 

Fig.  2.4  shows  the  output  SINR  of  the  CG-MF  detector  wr+ 1  (r  =  6)  normalized  by  that 
of  the  MF  detector,  i.e.,  (>r+\ / Pwv,  as  a  function  of  ap.  It  is  seen  that  the  output  SINRs  of 
the  two  detectors  remain  nearly  identical  (/Wi  //Air  >  0.99)  for  a  relative  perturbation  size  as 
large  as  ap  =  30%,  which  indicates  that  our  perturbation  analysis  in  Theorem  2  for  the  CG-MF 
detectors  is  quite  accurate  over  a  wide  range  of  perturbation  size.  Fig.  2.5  depicts  the  probability 
of  detection  for  the  MF  and  CG-MF  detector  as  a  function  of  the  MF  output  SINR,  where  several 
values  of  ap  are  considered.  It  is  seen  that  with  a  relative  perturbation  size  as  large  as  ap  = 
30%,  the  detection  probability  of  the  two  detectors  are  nearly  identical.  At  ap  =  45%,  a  small 
difference  is  observed. 


2.6  Concluding  Remarks 

The  CG  algorithm  can  be  used  to  solve  the  Wiener-Hopf  equation  underlying  the  MF,  which  leads 
to  a  family  of  linear  CG-MF  detectors  that  converge  to  the  MF  in  a  fixed  number  of  iterations. 
We  have  shown  that  the  CG-MF  detectors  are  all  CFAR  detectors,  they  can  be  recursively  and 
efficiently  computed  via  CG  iterations  over  an  expanding  Krylov  subspace,  and  each  of  them  is  an 
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Performance  of  CG-MF  with  perturbed  covariance  matrix  (r=6,Pfa=0.01) 


Figure  2.5:  Probability  of  detection  of  the  MF  detector  and  the  CG-MF  detector  after  r  +  1  =  7 
iterations  (r  =  6,  JN  =  16,  Pfa  =  0.01). 

optimum  reduced-dimensional  detector  in  the  sense  that  it  yields  the  maximum  output  SINR  over 
all  linear  detectors  residing  the  Krylov  subspace.  For  disturbance  covariances  with  a  low-rank 
structure  (rank-r),  we  have  shown  that  the  presence  of  a  perturbation  component  A  disrupting 
the  low-rank  structure  has  minimum  effect  on  the  convergence  of  the  CG  algorithm,  in  that  the 
output  SINR  of  the  (r  +  l)-st  CG-MF  detector  is  nearly  identical  to  that  of  the  MF  detector. 
This  offers  significant  computational  saving,  in  particular  when  r  is  small,  by  using  the  CG-MF 
instead  of  the  MF  detector  without  incurring  undue  penalty  in  detection  performance.  A  future 
topic  of  interest  is  to  analyze  the  CG  algorithm  for  adaptive  detection  when  the  covariance  matrix 
R  is  unknown  and  estimated  from  training  signals. 


2.7  Appendix:  Proof  of  Lemma  1 

Proof:  The  CG-MF  solution  wr+\  obtained  at  the  (r  +  l)-st  iteration  is  the  i?.- orthogonal  projec¬ 
tion  of  iuMF  onto  the  Krylov  subspace  tC{R,  s,  r  +  1)  [10].  This  means  that  the  i?.-norm  of  the 
approximation  error  is  minimized  over  all  vectors  in  JC(R,  s,r  +  1),  which  is  the  column  space 
of  Sr  =  [s,  Rs ,  R2s,  R3s,  •  •  •  ,  Rr s]  [10].  That  is, 


mMF  —  wv+i||.R 


min 

ak 


w  —  akRks 

k= 0 


R 


(2.48) 
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—  1  1  . 

Substituting  w  =  R  s  and  ||  •  \\r  —  R 2  (•)  into  (2.48),  we  have 

r 

R*{R-1s-'%2akRks) 

k= 0 
r 

R2s  —  akR2Rks 

k= 0 

The  minimum  approximation  error  is  achieved  if  and  only  if  the  vector  Y7k= o  cikR2Rks  is  the 
orthogonal  projection  of  the  vector  R~2s  onto  the  linearly  transformed  Krylov  subspace 

R2IC(R,  s,  r  +  1)  =  span{i?2s;  RiRs,  ■  ■  ■  ,R2Rs}  (2.50) 

or  the  column  space  of  Sr  =  R2  Sr.  When  the  minimum  of  (2.49)  is  achieved,  the  approximation 
error  is  given  by 

_  1  |  _  1 

WMF  —  Wr+l  —  R  2PgR  2s  (2.51) 

where 

P±r  =1-  Sr(Sf  Sr^Sf 

=  1-  R*Sr(S? RS^S? R^  (2.52) 


which  is  the  orthogonal  complement  projection  matrix  of  the  transformed  Krylov  subspace 
R2JC(R ,  s,  r  +  1).  Substituting  (2.52)  into  (2.51),  we  have 


wMF  -  wr+i  =  ( R -1  -  SrffiRSj-'S?)  s. 

(2.53) 

Since  wMF 

=  R  ks,  and  the  vector  Sr(S^ i?5r)_15f  s  e  IC(R,  s,  r  +  1),  so 

and 

Wr  =  SriS^RSr^S^S, 

(2.54) 

wMF  -  wr+1  =  ( R -1  -  Sr(S? i?5r)_15f )  8. 

(2.55) 

Expanding  Rm  =  (R0  +  A)m,  we  have 

m 

Rm  =  R™  +  ^  R^-b ARq-1  +  o(||  A||) 

k= 1 

m 

«  R™  +  R™~k ARq-1  .  (2.56) 

k= 1 

If  the  columns  of  Tr  span  the  Krylov  subspace  JC(R0 ,  s,  r  +  1),  then  Sr  can  be  approximated  by 

Sr  «  Tr  +  (2.57) 


mMF  -  «Vh||r  =  min 

ak 


=  min 

ak 
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where  <&a  is  defined  by  (2.31).  After  substituting  (2.57)  into  (2.55),  while  using  a  first-order 
expansion  on  (S^ilSV)-1,  (Sf  RSr)1  can  be  approximated  as 

(S^RSr)-1  =  ((TV  +  $A)^(i*o  +  A  )(Tr  +  $a))_1 
~  (T^RoTr)-1  -  {T^R0Tr)-l^^RQTr)-1  (2.58) 

where  \I>a  =  T^Ro® a  +  Tf  ATr  +  <E»A_RoTr.  Similarly  using  expansion  on  i?  1 ,  we  have 

ir1  =  (P0  +  A)-1  «  P0  1  -  R^ARo1.  (2.59) 

Substituting  (2.58)  and  (2.59)  into  (2.55),  and  discarding  the  second  and  higher-order  terms,  we 
have 


WUF  -  Wr+1  «( R0  1  -  T r{T^ RqT r)  lT f 

+  Tr(Tf  PT,.)-1 

-  QAiTfRoTj-'T?  -  R0  lAR0 1 
-Tr(TfJR0Tr)-1$f)S.  (2.60) 

Since  P0  is  a  rank-r  correction  matrix  to  cr(i\  the  solution  P()  1  s  lies  in  the  Krylov  subspace 

_  1 

JC(Ro,s,r  +  1)  [10].  Hence,  R0  2s  lies  in  the  linearly  transformed  Krylov  subspace 

i  _i 

Rq1C(R0,  s,  r  +  1)  or  the  column  space  of  Tr  =  R(2Tr,  and  as  such  R0  2s  is  equal  to 
zero  vector,  i.e. 

=  (/  -  RlTr((RlTr)H(RlTr))-1T^Rl)R0K 

=  (iV -  RlTr(T»RQTr)-lT»)s  =  0.  (2.61) 

_  1 

Since  R0  is  a  positive-definite  Hermitian  matrix,  left  multiplying  both  sides  of  (2.61)  by  R0  2 
yields 

(PV1  -  Tr(Tf R0Tr)-lT?)  s  =  0.  (2.62) 

Substituting  (2.62)  into  (2.60),  we  obtain  the  difference 


WMF  -  Wr+ 1  « 

(rr{T?  RoTr)-\T^  P0$a  +  Tf  ATr  +  $£ R0Tr ) 
x  (if.  P0T,)_1Tf  -  $A(T"i20Tr)-17f 

-  P0  1 AP0  1  -  Tr(Tf  PoTr)-^f )  s.  (2.63) 
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Note  that 


Tr(Tf  R0Tr)-1T?ATr{T?  R0T  ,)~lT^  s 

=  Ro  2pTrR o  5  A-Ro  2pTrRo  2s  (2-64) 

where  Pj,  =  Tr(Tr  Tr)~lTr  is  the  orthogonal  projection  matrix  onto  the  column  space  of 
~  _  1  „ 

Tr.  From  the  previous  analysis,  the  vector  R{)  2  s  lies  in  the  column  space  of  Tr.  Therefore, 

Pj,R02s  =  R02s.  (2.65) 

Substituting  (2.65)  into  (2.64),  we  have 

T  r{T^  RqT  rylT^  AT  r{T^  RqT  r)~lT^  s 
=  R0  2PfrR0  2  AR0 1s  (2.66) 

and 


[R01AR01  -  Tr(Tf R0Tr)~lT* ATr 
x  (If  fl0Tr)-1T*)a 

=  R0h-P±RPAR01s. 

Similarly,  it  can  be  be  proved  that 

(®a(T?r0  Tr)~lT* 

-  Tr(Tf )  s 
=  R0 l2P±Rl<f>A(T»  RoT^T^s 

and 

(rr(T»R0T,r)-l<s>l 

-Tr(TfJR0Tr)-1$fJR0Tr(TfJR0Tr)-1Tf)S 
=  T^RoTry^lP^R^s  =  0. 

Substituting  (2.67)-(2.69)  into  (2.63),  we  have 

-i  ,  _i 

WMF  -  Wr+ 1  «  R0  2Pt  R0  2 

(Ai?Q 1  +  R0®a(T? i20Tr)"1lf )  s 
=  R-JP^R-Jd 

where  d  is  defined  by  (2.30). 


(2.67) 


(2.68) 


(2.69) 


(2.70) 

Q.E.D. 
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Chapter  3 


Conjugate  Gradient  Parametric  Matched 
Filter 


3.1  Introduction 

With  extra  spatial  information  provided  by  multiple  sensors,  higher  performance  of  signal  de¬ 
tection  can  be  achieved  (than  a  single-sensor  system),  especially  in  detection  of  signals  buried 
in  a  background  of  directional  jammers  and  space-time  correlated  clutter.  A  widely  explored 
technology  for  multichannel  signal  detection  is  space-time  adaptive  processing  (STAP)  [1],  first 
proposed  by  Brennan,  Reed  and  Mallett  [2].  Most  STAP-based  methods,  such  as  the  adaptive 
matched  filter  (AMF)  [3]  and  Kelly’s  generalized  likelihood  ratio  test  (GLRT)  [4],  need  to  invert 
a  large  space-time  covariance  matrix.  These  methods  require  not  only  a  large  number  of  indepen¬ 
dent,  identically  distributed,  signal-free  training  data  to  estimate  the  matrix,  but  they  also  incur  a 
high  computational  cost  for  matrix  estimation  and  inversion. 

A  parametric  STAP  detector  based  on  a  multichannel  autoregressive  (AR)  disturbance  model 
has  been  proposed  for  airborne  radar  applications  [16,  21]  to  reduce  both  the  training  data  re¬ 
quirement  and  computation  load.  This  method  is  called  the  parametric  adaptive  matched  fil¬ 
ter  (PAMF)  [16].  While  the  PAMF  detector  has  been  found  to  yield  exceptional  performance 
with  significantly  reduced  training  and  computational  requirements  when  compared  with  fully 
adaptive  STAP  detectors,  the  connections  between  the  PAMF  and  other  reduced-dimensional  or 
partially  adaptive  STAP  detectors  [1],  which  have  similar  benefits  in  training  and  complexity, 
remains  unclear. 

This  chapter  aims  to  provide  some  insights  into  this  problem  by  employing  the  conjugate- 
gradient  (CG)  method  to  solve  the  linear  prediction  problem  underlying  the  temporal  whitening 
phase  of  the  PAMF  detector.  Our  choice  of  the  CG  method  is  motivated  by  several  factors.  First, 
as  will  be  shown,  the  CG  algorithm  naturally  leads  to  a  subspace  interpretation  of  the  PAMF 
detector,  and  offers  a  connection  to  the  other  reduced-rank  STAP  detectors.  Second,  the  CG 
method  is  a  computational  efficient  algorithm  to  solve  the  linear  prediction  problem  underlying 
the  PAMF  detector.  In  particular,  for  airborne  radar  applications,  due  to  an  inherent  structure  of 
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the  disturbance  covariance  matrix,  the  CG  algorithm  can  usually  achieve  convergence  using  only 
a  few  iterations,  thus  providing  significant  computational  saving.  Third,  since  the  disturbance 
covariance  matrix  has  a  block- Toeplitz  matrix  structure,  preconditioning  methods(e.g.,  [10,22, 
23])  can  be  employed,  which  are  very  effective  in  further  speeding  up  the  convergence  rate  in 
CG  iterations,  while  adding  up  only  a  modest  computational  overhead  per  iteration  (due  to  the 
block- Toeplitz  structure).  Finally,  as  a  by-product,  we  show  that  the  CG  algorithm  also  yields  a 
new  and  computationally  efficient  AR  model  order  selection  method  that  can  be  integrated  with 
the  CG  iterations. 

The  remainder  of  this  chapter  is  organized  as  follows.  The  signal  detection  problem  is  intro¬ 
duced  in  Section  3.2.  A  brief  review  of  the  matched  filter  (MF)  and  parametric  matched  filter 
(PMF)  detectors  is  provided  in  Section  3.3.  In  Section  3.4,  the  CG  versions  of  MF  (CG-MF)  and 
PMF  (CG-PMF)  and  a  CG-based  model  order  selection  method  are  proposed.  The  convergence 
rate  of  CG  in  airborne  radar  applications,  along  with  a  preconditioned  CG-PMF  (PCG-PMF)  de¬ 
tector,  are  also  discussed  there.  In  Section  3.5,  we  consider  the  adaptive  case  and  present  a  new 
model  order-selection  CG-PAMF  (OSCG-PAMF)  detector,  when  both  the  AR  model  order  and 
coefficients  are  unknown.  The  performance  of  the  proposed  class  of  CG-PMF  and  CG-PAMF 
detectors  is  illustrated  by  numerical  results  in  Section  3.6.  Finally  conclusions  are  summarized 
in  Section  3.7. 

Vectors  and  matrices  are  denoted  by  boldface  lower-case  and  upper-case  letters,  respectively. 
Transpose,  complex  conjugate  and  complex  conjugate  transpose  are  respectively  represented  by 
(•)T,  (•)*  and  ( -)H.  C  and  M  denote  the  complex  and  real  number  fields.  CM(n,  R)  denotes  an 
additive  multivariate  Gaussian  random  variable  with  mean  vector  /i  and  covariance  matrix  R. 


3.2  Data  Model 

Consider  a  received  ./-channel  sequence  (x(n)|n  =  1,2,  •••  ,N}  corrupted  by  a  space-time 
correlated  disturbance  random  process  c (n) .  The  detection  problem  involves  the  following  binary 
hypotheses: 


H0  :  x(n)  =  c(n) 

Hi  :  x(n)  =  as(n)  +  c (n)  (3.1) 

where  s (n)  is  a  known  ./-channel  signal  and  a  is  its  deterministic  but  unknown  complex  ampli¬ 
tude.  All  vectors  in  (3.1)  are  J  x  1  vectors.  For  convenience  of  later  discussions,  define  the  fol¬ 
lowing  vectors  in  descending  order:  s  =  [sT(lV),  sT(N  —  1),  •  •  •  ,  sT(l)]T,  c  =  [cT(lV),  cT(V  — 
1),  •  •  •  ,  cT(l)]T,  x  =  [xT{N),  xT(lV  —  1),  •  •  •  ,  xr(l)]3  .  It  is  standard  to  assume  that  the  dis¬ 
turbance  c  is  a  Gaussian  random  vector  with  zero-mean  and  space-time  covariance  matrix  Rc  e 
C JNxJN ^  yvhpg  the  signal  vector  s(n)  is  deterministic  (Swerling  0  target).  Based  on  these  as¬ 
sumptions,  x  ~  CM  {as,  Rc),  where  a  =  0  under  H0  and  a/0  under  Hi. 

In  STAP,  the  signal  s  is  known  as  the  space-time  steering  vector: 

s  =  st(g)ss  (3.2) 
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where  st  and  s,  denote  the  temporal  steering  vector  and  spatial  steering  vector,  respectively, 
and  (8)  denotes  the  Kronecker  product.  For  a  side-looking  uniform  linear  array  (ULA),  we  have 
st  =  (l/V^V)[e*27r^JV_1^d,  •  •  •  ,e*27r^d,l]r  with  a  normalized  Doppler  frequency  fa  and  ss  = 
(l/v/J)[e!2,r^_1^s,  •  •  •  ,  el27VP ,  1]T  with  a  normalized  spatial  frequency  fs.  Practically,  the  true 
disturbance  covariance  matrix  Rc  is  unknown,  and  often  an  estimate  can  be  obtained  from  the 
secondary  data: 

1  K 

Rc=— ^cfccf  (3.3) 

Y  k= 1 

where  c k,k  =  1,  2  ■■■,/!,  denote  the  secondary  data  vectors  assumed  to  be  signal-free.  Ac¬ 
cording  to  the  well-known  “RMB”  rule  [24],  we  need  K  >  2JN  —  3  so  that  the  average  output 
signal-to-interference-plus-noise  ratio  (SINR)  loss  caused  by  covariance  estimation  error  is  less 
than  3  dB.  Detectors  with  an  estimated  covariance  matrix  are  often  called  adaptive  methods. 


3.3  MF  and  PMF 


Assuming  a  known  Rc,  the  matched  filter  (MF)  is  obtained  by  maximizing  the  output  SINR  of  a 
linear  receiver  or  the  generalized  likelihood  ratio  (GLR).  The  test  is  given  by  (e.g.,  [3]): 


s^R-^l2//! 

— tt- p: — ] —  $  Vmf 

SHR-1S  Ho 


(3.4) 


where  rj mf  is  the  threshold  of  the  MF.  Equation  (3.4)  is  the  well  known  matched  subspace  detector 
for  a  rank-1  signal  in  colored  noise.  Consequently,  it  offers  unbeatable  performance  for  the 
detection  problem  considered  in  equation  (3.1). 

For  ease  of  exposition,  the  MF  can  also  be  represented  by  using  a  structure  of  temporal 
whitening  cascaded  with  spatial  whitening  arising  from  a  block  LDU  decomposition  of  the  dis¬ 
turbance  covariance  matrix  [16].  This  form  of  MF  is  given  by 

|(Q-1/2L-1s)«(Q-1',2e)|2  |s«i/|2«, 

(Q-'/2L-1s)»(Q-1/2L-1s)  s"s  jfo',MF 


where  Q  e  qJNxjn  js  a  block-diagonal  matrix  with  Hermitian  matrices  Q (n),n  =  1,  2,  •  •  •  ,  N, 
along  the  main  block  diagonal,  and  L  G  CJNxJN  is  a  lower  block-triangular  matrix  with  J  x 
J  identity  matrices  along  the  main  block  diagonal.  Both  L  and  Q  come  from  a  block  LDU 
decomposition  of  the  disturbance  covariance  matrix  Rc  =  LQL”.  Finally, 


(n-l) 

e(n)  =  x(n)  -  ^  A^ (p)x(n  -  p) 
p= i 


s  (n) 


u{n)  =  Q  1^2(n)e(n) 


Q-1/2(n) 


(n-l) 

S(n)  “  An(p)S(n~P ) 

P=  1 


(3.6) 

(3.7) 

(3.8) 
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where  A %(p)  G  CJxJ  is  a  block  element  of  L  1  located  at  the  (n  —  p)-th  block  column  and  the 
n-th  block  row.  Due  to  the  fact  that  there  is  no  performance  penalty  for  the  prewhitening  of  the 
interference  [25,  Ch.6],  (3.5)  is  equivalent  to  (3.4). 

If  the  disturbance  c(n)  is  stationary  in  time,  the  MF  can  be  simplified.  A  parametric  matched 
filter  (PMF)  was  introduced  in  [16]  by  modeling  the  disturbance  as  a  stationary  P-th-order  mul¬ 
tichannel  autoregressive  (AR)  process.  Specifically, 

p 

c(n)  =  Aif(p)c(n  —  p)  +  eP(n )  (3.9) 

p=  i 


where  A H(p),p  =  1, 2,  •  •  •  ,  P,  is  the  p-th  AR  matrix  coefficient  of  linear  prediction,  and  eP(n ) 
is  the  temporally  white  noise  with  a  spatial  covariance  matrix  QP.  The  PMF  test  is  given  by  [16] 


v  N 


; H 


p+is p  {n)uP{n) 


aH  ( 
P+1 ®P 1 


n)sp(n) 


JH  i 

"  ^  Vpmf 
Ho 


(3.10) 


where  isp(n) 


£p(n) and 


sp(n)  =  Q 


-1/2 


s  in 


~  ^2  AH (p)s(n  -  p) 


p= i 


(3.11) 


for  n  —  P  +  1,  •  •  •  ,  N.  In  practice,  the  model  order  P  and  the  AR  coefficients  { A  (yi ) }  are 
unknown  and  hence  estimated  from  the  secondary  data  and/or  primary  data.  Different  estimators 
lead  to  different  versions  and  implementations  of  the  PAMF  detector  [16, 17]. 


3.4  CG-MF  and  CG-PMF 

In  this  section,  we  discuss  alternative  implementations  of  the  MF  and  PMF  via  the  CG  algorithm. 
The  resulting  detectors  are  referred  to  as  the  CG-MF  and  CG-PMF  detectors,  respectively,  for 
brevity.  We  start  from  the  CG-MF,  which  also  sets  the  basis  for  the  CG-PMF.  The  latter,  by  as¬ 
suming  that  the  disturbance  c (n)  is  temporally  stationary,  is  a  computationally  simplified  version 
of  the  CG-MF.  The  link  between  the  PMF  and  CG  as  developed  in  the  sequel  reveals  the  PMF 
as  a  reduced-dimensional  subspace  detector.  In  this  section,  we  assume  knowledge  of  the  covari¬ 
ance  matrix  of  the  disturbance  signal.  An  adaptive  versions  of  the  CG-PMF  (i.e.,  CG-PAMF)  is 
discussed  in  Section  3.5. 

3.4.1  Conjugate-Gradient  MF 

The  MF  detector,  as  shown  in  Section  3.3,  can  be  derived  from  a  time- varying  linear  prediction 
process.  Specifically,  consider  the  problem  of  linearly  predicting  the  n-th  sample  x(n)  under  H0 
from  all  prior  received  samples  x(n  —  1),  x(n  —  2), ... ,  x(l)  (cf.  (3.9)) 

x(n)  =  B^(n)y(n)  +  e(n)  (3.12) 
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where  B(n)  =  [A*(l),  A*(2),  •  •  •  ,A*(n-l)}H  =  [B^n),  B2(n),  •  •  •  ,Bj(n)]  G  Cj^xj 

denotes  the  (n  —  l)-st  order  time- varying  multichannel  linear  prediction  filter,  and  y (n)  = 
[yn(l),yn(2),  •  •  •  ,yn(J(n  -  1)]T  =  [xT(n  -  l),xT(n  -  2),  •  •  •  ,xT(l)]T  contains  all  n  -  1  pre¬ 
viously  received  data  vectors.  It  is  noted  that  the  above  time-varying  linear  predictor  grows  in  its 
filter  order  or  size  with  n.  The  multichannel  linear  predictor  is  equivalent  to  J  linear  predictors: 

Xj(n)  =  Bf(n)y(n)  +  £j(n),  j  =  1, 2,  —  ,  J  (3.13) 

where  B?  (n)  is  a  J{n  —  1) -dimensional  vector  which  contains  the  cross-channel  correlation  in¬ 
formation  associated  with  the  j-th  channel.  The  optimum  linear  predictor  can  be  obtained  by 
solving  the  Wiener-Hopf  equations: 

Ry(n)Bj(n)  =  Rj(n),  j  =  1,2,  •  •  •  ,  J  (3.14) 

where  Ry(n)  =  E[y(n)yH (n)\  G  and  Rj(n)  =  E[y(n)x*(n )]  G  CJ(n_1)xl. 

Again,  note  that  the  size  of  the  Wiener-Hopf  equation  grows  with  n. 

To  obtain  a  temporally  whitened  sequence  e(n)  for  MF  detection  (cf.  (3.6)),  the  above  linear 
prediction  process  has  to  be  performed  multiple  times,  starting  from  n  —  2  to  n  —  N.  For  each 
n,  we  need  to  solve  a  Wiener-Hopf  equation  of  the  form  (3.14).  While  there  are  various  solvers  to 
the  linear  Wiener-Hopf  equation,  we  consider  using  the  conjugate  gradient  (CG)  method,  which 
has  several  properties  such  as  fast  convergence,  a  direct  link  to  the  Krylov  subspace  [10],  and  a 
built  in  model  order  selection  capability.  Additional  remarks  on  such  aspects  are  provided  shortly. 

The  recursive  procedure  involved  for  the  determination  of  the  linear  predictors  is  described 
as  follows  (also  see  (3.9)). 
for  n  =  2  to  N  do 
for  j  =  1  to  J  do 

Initialization.  Initialize  the  conjugate-direction  vector  D0y(n),  gradient  vector  7 ,  y (n) 
and  initial  solution  B0y(n): 


Dij(n)  =  — 7iy0)  =  Rj(rc) 
B0  j  (n)  =  0. 

for  k  —  1, 2,  •  •  • ,  till  convergence  (k  <  J(n  —  1))  do 
Update  the  step  size  akj: 


OLkjijl) 


llTfcjWIP 


1 (//)!? ,(//)!)/,,(//) 


Update  the  solution  B  k  f. 

B  kj{n)  =  Bfe_i  j(n)  +  akd(n)DkJ(n). 


Update  the  gradient  vector  7fc+l  j: 

lk+iAn)  =  (w)  +  oikj(n)  Ry(n)DfcJ(n). 


(3.15) 

(3.16) 

(3.17) 

(3.18) 

(3.19) 
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Figure  3.1:  Time- varying  linear  prediction  in  the  conjugate-gradient  MF  detector. 


Update  the  conjugate-direction  vector  Dfc+lj-: 


ll7fc+i,j(X>l|2 

llTfcjWII2 


Tfc+ijW- 


(3.20) 


end  for 
end  for 
end  for 

Let  B  ( n )  be  the  multichannel  linear  predictor  formed  from  B kj  after  convergence.  Then, 
B  {n)  can  be  used  to  whiten  x(n)  to  produce  a  temporally  whitened  sequence  e(n).  The  spatial 
covariance  matrix  Q(n)  of  e(n)  is  given  by  (cf.  (3.15)) 

Q  (n)  =  E[e(n)£H  (n)] 

=  Rx(n)  -  B^(n)Ryx(n)  (3.21) 

where  Rx(n)  =  E[x(n)x.H  (n)}  G  CJxJ,  and  Ryx(n)  =  E[y(n)x.H  (n)]  £  <£J(n-i)xj ,  [ s 

used  for  further  spatial  whitening  [16]. 

Fig.  3.1  depicts  the  CG-MF  detector  that  produces  the  n-th  sample  of  the  temporally  whitened 
sequence  £j[n)  for  the  j-th  channel,  where  Dfcj(n)  =  [Dij(n),  D2j(n),  •  •  •  ,  Dfej-(n)]  is  the 
conjugate-direction  matrix.  CG  iterations  lead  to  a  set  of  linearly  independent  vectors 
D  |  .j  (n) , ....  D k,j(n)  that  are  conjugate  orthogonal ,  i.e. 

DfJ.(n)Ry(n)Dij-(n)  =  0,  k  ±  l.  (3.22) 

The  output  of  the  k- th  iteration  is  given  by 

k 

Bk,j(n)  =  ^2  am,ji,n)'Dmd(n)  (3.23) 

m= 1 

which  is  a  vector  in  the  A  -dimcnsional  vector  space  spanned  by  the  conjugate-direction  vectors 
{Dm;  j(n),m  =  1,2,---  ,  k}.  The  iterative  procedure  for  the  prediction  of  the  n-th  sample  Xj{n), 
which  involves  a  J{n  —  l)-st  order  linear  predictor,  converges  after  at  most  J(n  —  1)  iterations. 
The  final  solution  B j(n)  lies  in  a  J{n—  1) -dimensional  vector  space. 
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Figure  3.2:  Time-invariant  linear  prediction  in  the  conjugate-gradient  PMF  detector. 

3.4.2  Conjugate-Gradient  PMF  with  Known  AR  Model  Order 

If  the  disturbance  signal  can  be  approximated  as  a  temporally  wide-sense  stationary  (WSS)  multi¬ 
channel  AR  process,  the  linear  prediction  problem  of  the  previous  subsection  can  be  significantly 
simplified.  Specifically,  suppose  the  disturbance  is  an  AR(P)  process  with  model  order  P.  In 
this  case,  the  optimum  linear  predictor  for  the  n-th  sample  x(n)  requires  only  P  most  recently 
received  samples  (as  opposed  to  all  past  samples)  and  the  prediction  filter  is  time-invariant  with 
a  fixed  size  (as  opposed  to  time-varying  with  a  growing  size)  [26] : 

x(n)  =  BHy  P(n)  +  £P(n )  (3.24) 

where  the  fixed  P-th  order  linear  predictor 

B  =  [Ap(l),  Ah(2),  ■■■  ,  A  H(P)JH  =  [B1;  B2,  •  •  •  ,  Bj]  G  CJPxJ  (3.25) 

is  composed  of  the  AR  coefficient  matrices  (A H(p)}  (cf.  (3.9)), 

y p(n)  =  [yn(l),yn(2),  ■  ■  ■  ,yn(JP)]T  =  [xT(n  -  1  ),xT(n  -  2),  •  •  •  ,xT(n  -  P)]T  (3.26) 

denotes  the  regression  data  vector,  and  n  >  P.  Again,  it  is  convenient  to  express  the  above 
multichannel  linear  predictor  as  J  scalar  linear  predictors: 

Xj(n)  =  Bf  yP(n)  +  £pj(n),  j  =  1,  2,  ■  ■  •  ,  J.  (3.27) 

The  structure  of  temporal  whitening  via  linear  prediction  for  the  PMF  detector  is  shown  in 
Fig.  3.2. 

The  solution  to  the  scalar  linear  prediction  problem  can  be  obtained  by  solving  the  following 
Wiener-Hopf  equation 

ByBj  =  R?,  j  =  1,  2,  •  •  •  ,  J  (3.28) 

where  Ry  =  E[yP(n)yp(n)\  G  CJPxJP  and  R j  =  E[y P{n)x*{n)]  G  CJPxl.  It  should  be  noted 
that  unlike  the  MF  detector,  the  above  Wiener-Hopf  is  time-invariant,  has  a  fixed  size,  and  needs 
to  be  solved  only  once.  The  resulting  solution  B;  can  be  used  to  whiten  the  entire  received  signal 
x(n)  for  n  >  P.  The  CG  algorithm  can  also  be  applied  to  solve  (3.28),  and  the  resulting  detector 
is  referred  to  as  the  CG-PMF  detector.  Since  only  one  fixed-sized  Wiener-Hopf  equation  needs 
to  be  solved,  the  CG-PMF  detector  is  computationally  much  simpler.  Specifically,  the  outer  loop 
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for  varying  n  as  discussed  in  Section  3.4.1  vanishes,  and  only  the  conjugate-gradient  processing 
with  n  —  P  +  1  is  needed. 

Remark:  The  iterative  procedure  of  CG  converges  after  at  most  JP  iterations  for  the  CG- 
PMF.  As  a  result,  the  final  solution  By  lies  in  a  .//'-dimensional  vector  space  spanned  by  the 
conjugate  direction  vectors  D kj,  k  =  1,2,...,  JP,  or  equivalently,  the  .//'-dimensional  Krylov 
subspace  [10]: 

/C(Rj,  Ry,  JP)  =  span  {Rj,  RyRj,  •  •  •  ,  RyP_1R.,}  .  (3.29) 

This  shows  that  the  PMF  is  a  reduced  .//'-dimensional  solution,  as  opposed  to  the  full  JN- 
dimensional  MF  detector.  The  same  conclusion  applies  to  the  adaptive  version  CG-PAMF  detec¬ 
tor  discussed  in  Section  3.5. 


3.4.3  Model  Order  Selection  by  CG 

In  practice,  the  AR  model  order  P  of  the  disturbance  is  often  unknown  and  has  to  be  estimated. 
A  practical  approach  is  to  choose  an  upper  bound  P  for  P,  and  use  the  CG  algorithm  to  solve  the 
following  Wiener-Hopf  equation 

Rf)Bf)  =  Rf),  J  =  1,2,---  ,  J  (3.30) 

where  R^  =  E[yp(n)yp(n)]  G  Cj^xjp  and  R^  =  E[yp(n)x*(n)}  G  CJPxl.  The  CG 

iterative  procedure  will  converge  after  at  most  JP  iterations  with  B'/  1  =  | Bf , 07/( x  j  1 . 
However,  with  a  loosely  determined  upper  bound  P,  it  is  often  necessary  for  the  sake  of  reducing 
computational  complexity  to  stop  the  CG  iterations  before  it  reaches  the  maximum  number  of 
iterations.  In  this  section,  we  propose  a  model  order  selection  method  for  use  with  the  CG 
algorithm,  which  is  based  on  the  following  result. 


Lemma  2  Suppose  the  disturbance  in  (3.1)  is  a  J -channel  AR(P)  process.  Let  Bj/y  G  CJ^xl 
be  the  solution  to  (3.30)  obtained  by  CG  at  the  k-th  iteration,  where  k  =  Jp  and  p  <  P.  Let 
B(p)  G  C Jpx  1  be  the  solution  to  RyP)B(/,)  =  R(;,).  Then  we  have 


BfJ  =  \\V;Bf .  when  p  =  P 


(3.31) 


where  W/,:J  =  I  I  )\[r  D[.^  =  | D , ^ ,  D^,  •  •  •  ,  D^]  is  the  conjugate-direction  matrix,  and 
D  kj  is  a  k  x  k  matrix  composed  of  the  first  k  rows  o/'D/i:j  =  D]  ,r  D2j,  •  •  •  ,  Dfcj]  with 


D 


k,j 


DfJHRpDlfJ 


proof  See  Appendix  3.8. 


(3.32) 

Q.E.D. 
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From  (3.31),  when  p  =  P,  the  JP  x  JP  matrix  W JPj  transforms  B;  in  /C(R;. Ry.  JP ), 
which  is  generated  by  CG-PMF  with  a  known  P,  to  B^p  ■  in  /C(Rj,  Ry  \  JP),  which  is  gen¬ 
erated  by  CG-PMF  with  an  unknown  P.  So  the  PMF  AR  coefficient  vector  B;  is  completely 

(  p\ 

determined  by  the  truncated  solution  B  JP  •  of  CG  with  an  unknown  P  after  JP  iterations. 

We  now  explain  how  to  use  Lemma  2  for  AR  model  order  selection  in  CG-PMF.  Define  the 
residue  vector 

ey  =  Bg>  -  Dg'D"  Bf>,  for  k  =  Jp.  (3.33) 


According  to  (3.31),  ekj  =  0  when  k  =  JP,  so  the  norm  of  ek]  can  be  used  for  model  order 

(p) 

selection.  However,  since  the  Wiener  solution  B  is  practically  unknown,  cannot  be  di- 
rectly  computed  from  (3.33).  We  propose  an  approach  to  replace  B  j  in  (3.33)  by  the  truncated 
solution  composed  of  the  first  k  elements  of  Bfc  ’ ,  which  can  be  considered  as  an  approximation 

of[Bj,0  TJ(P_P)X1}T 


ek,j  ~ 

where  B^  contains  the  first  k  =  Jp 
procedure  is  summarized  as  follows. 


By  - 


elements  of  B 


(P) 

k,j  ‘ 


(3.34) 

Our  CG-based  model  order  selection 


•  Step  1:  Select  an  upper  bound  P  for  the  model  order.  One  such  an  upper  bound  suggested 
in  [16]  for  STAP  detection  is 


P  =  max 


(3.35) 


where  [-J  rounds  a  real-valued  number  towards  zero. 

•  Step  2:  Use  the  CG  algorithm  to  solve  the  Wiener-Hopf  equation  (3.30). 

-  Step  2.1:  Following  every  J  iterations  of  the  CG  algorithm,  compute  the  average 
residue  over  J  channels: 


=  7  SlI^'H2’  k  =  J,2J,...  (3.36) 

3= 1 

-  Step  2.2:  If  e 2Jp  is  smaller  than  a  specified  tolerance  level,  then  stop  the  CG  iteration, 
and  the  estimated  AR  model  order  is  P  =  p. 

The  advantage  of  the  above  CG-based  model  order  selection  method  is  that  it  does  not  require 
full  iterations  of  the  CG  algorithm  and  is  efficient.  The  complexity  of  the  CG  algorithm  with  full 
iterations  is  in  the  same  order  as  that  of  computing  the  inverse  of  RyP\  which  is  0(J3P3),  while 
the  complexity  of  using  the  CG-based  order  selection  method,  is  0(J3PP2).  This  is  because  the 
latter  only  requires  JP  iterations  to  determine  the  model  order,  and  the  additional  complexity 
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in  each  J  iterations  for  (3.34)  is  the  complexity  of  two  matrix-vector  multiplications,  which  is 
2(JP)2.  So  the  total  complexity  is  0(JP(JP )2  +  2 P(JP)2)  «  0(J3PP2).  Next,  we  compare 
the  computational  complexity  of  the  CG-PMF  with  an  unknown  P  with  the  complexity  of  the 
eigencanceler  [5],  which  is  a  standard  eigen-subspace  detector.  The  eigencanceler  method,  has 
a  complexity  of  9  J3N3  by  using  the  symmetric  QR  algorithm  to  obtain  the  eigen-subspace  and 
its  corresponding  eigen- values  [10].  Since  P  <  N,  and  generally  P  <C  N,  the  complexity  of 
CG-PMF  is  much  lower  than  eigencanceler. 

3.4.4  Convergence  in  Airborne  Radar  Applications 

One  important  property  of  the  CG  algorithm  is  its  fast  convergence.  In  general,  it  takes  no 
more  than  JP  iterations  to  solve  the  linear  equation  (3.28)  [10].  Even  faster  convergence  is 
possible  if  the  covariance  matrix  of  the  disturbance  has  some  specific  structure.  In  particular,  if 
the  covariance  matrix  is  a  rank-rc  correction  of  an  identity  matrix: 

Ry  =  Ri  +  <7 2 1  (3.37) 

where  R,  is  a  rank-rc  positive  semi-definite  matrix,  then  the  CG  algorithm  converges  in  at  most 
rc  +  1  iterations  [10]. 

In  airborne  radar  applications,  the  disturbance  covariance  matrix  often  consists  of  two  com¬ 
ponents,  namely  a  low-rank  Rj  due  to  the  clutter  and  jamming  and  a  scaled  identity  a2 1  due  to  the 
white  noise,  where  a2  denotes  the  noise  variance.  The  rank  rc  is  typically  much  smaller  than  the 
joint  spatio-temporal  dimension  JN .  Specifically,  if  the  disturbance  is  primarily  due  to  ground 
clutter  and  thermal  noise,  then  according  to  Brennan’s  rule  [2],  the  rank  of  the  clutter  covariance 
matrix  for  the  full-dimensional  MF  is  approximately 

f c,fuii  ~  \J+(N-m  (3.38) 

where  (3  =  2 vgTr/d,  vg  is  the  platform  velocity,  Tr  is  the  pulse  repetition  period,  d  is  the  antenna 
element  spacing,  and  [ •]  rounds  a  real- valued  number  towards  infinity. 

Likewise,  we  can  approximate  the  rank  of  the  disturbance  covariance  matrix  for  the  PMF 
detector  as 

rc«  \J+(P-l)Pl  (3.39) 

The  smaller  rank  rc  over  (3.38)  is  due  to  the  fact  that  the  disturbance  covariance  matrix  is  formed 
over  P  pulses,  which  is  sufficient  for  the  reduced-dimensional  PMF  detector  due  to  the  underlying 
AR (P)  model.  Meanwhile,  the  space-time  disturbance  covariance  matrix  for  the  full-dimensional 
MF  detector  is  formed  over  N  (the  entire  number  of)  pulses.  As  such,  the  PMF  can  benefit  more 
from  the  fast  convergence  property  of  the  CG  algorithm. 

3.4.5  Preconditioned  Conjugate-Gradient  PMF 

In  cases  where  the  disturbance  covariance  does  not  have  a  low-rank  structure  as  in  (3.37),  precon¬ 
ditioning  is  usually  helpful  in  improving  the  convergence  rate.  The  idea  is  based  on  the  fact  that 
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the  convergence  rate  of  CG  is  determined  mainly  by  the  eigenvalue  structure  of  Ry.  In  particular, 
the  residue  between  the  Wiener  solution  and  kth-step  conjugate  gradient  result  is  bounded  by  [10] 

||Bfcj  -  BjRy  <  2||B0j-  -  Bj||Ry  +  ^  (3.40) 

where  ||u;||Ry  =  denotes  the  Ry  norm  and  n  is  the  condition  number  of  Ry.  It  is 

clear  that  rapid  convergence  can  be  achieved  if  k  is  near  1.  In  the  following,  we  discuss  the  use 
of  preconditioning  with  the  CG-PMF.  For  simplicity,  the  resulting  detector  is  referred  to  as  the 
PCG-PMF  detector. 

Specifically,  consider  the  modified  Wiener-Hopf  equation  (cf.  (3.28)) 

RyBj  =  Rj  (3.41) 

where  Ry  =  M~^RyM  B,  =  Ad^Bj,  Rj  =  Ad^Bj,  and  M  is  a  Hermitian positive-definite 
matrix  that  is  called  preconditioner  [10].  The  preconditioner  is  used  to  yield  a  better  conditioned 
Ry,  which  has  a  smaller  condition  number  than  Ry,  and  thus  a  faster  convergence  rate.  For  PMF, 
the  disturbance  covariance  matrix  is  a  block- Toeplitz  (BT)  matrix.  For  such  matrices,  block- 
circulant  (BC)  preconditioners  are  often  recommended  [22, 23].  Our  BC  preconditioner  can  be 
directly  computed  from  the  disturbance  covariance  matrix  Ry  which  has  the  following  block 
Toeplitz  matrix  structure: 


where  R x(m) 
by  [27] 


R, 


Rx(0) 


Rx(R-l)' 


(3.42) 


_Rx(l  ~P)  Rx(0) 

£,[x(n)x'ff(n  —  m)]  G  CJxJ.  In  particular,  the  BC  preconditioner  is  given 


M  = 


M0  MP_!  •  •  •  Ad^ 
Adi  Ado  ■  •  •  Ad2 

Adp_i  Adp_2  •  •  •  Adp 


where 


Mk 


(P  -  fc)Rx(fc)  +  kRx(k  -  P ) 
P 


0  <  k  <  P.  (3.43) 

It  is  noted  that,  as  shown  in  [10],  practically  Ad^s  does  not  need  to  be  explicitly  calculated  in  the 
PCG  algorithm.  The  PCG  algorithm  is  summarized  as  follows. 
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Initialization.  Initialize  the  conjugate-direction  vector  D^-,  gradient  vector  7^-,  precondi¬ 
tioned  vector  zij  and  initial  solution  B0j-: 


7i  J  =  -Rj 

(3.44) 

D  ij  =  zij  = 

(3.45) 

Boy  =  0. 

(3.46) 

for  k  —  1,  2,  •  •  • ,  till  convergence  ( k  <  J(P  —  1))  do 

Update  the  step  size  a^y: 

“‘J  i>:  iu>,  ' 

(3.47) 

Update  the  solution  Bfcj-: 

B k,j  Bfc_ij  T  OfcyD/jj. 

(3.48) 

Update  the  gradient  vector  7/c+lj: 

7fc+iy  =  Tfcy  +  QAjRyDfcj. 

(3.49) 

Update  the  preconditioned  vector  zfc+lj: 

Zfc+ijj  =  M_17fc+1J. 

(3.50) 

Update  the  conjugate-direction  vector  Dfc+ij 


end  for 


Dfc-i-ij  ^k,j  “t~ 


Tfc+ijzfe+ij 
7 kjzkj 


(3.51) 


The  complexity  associated  with  the  AR  parameter  estimation  in  PCG-PMF  is  summarized 
in  TABLE  I,  where  r  is  the  number  of  iterations  needed  by  the  PCG  algorithm  to  reach  con¬ 
vergence,  and  the  flop  counts  are  for  all  J  channels.  It  is  interesting  to  note  that  the  PCG-PMF 
is  computationally  very  efficient,  involving  approximately  0(rJ’JP  log2  P).  The  computational 
efficiency  is  primarily  due  to  the  fast  convergence  rate  offered  by  preconditioning  and  the  use 
of  a  BC  preconditioner,  as  explained  next.  In  the  following,  we  discuss  the  complexity  of  only 
M-1,  (3.47)  and  (3.50),  since  the  other  calculations  are  obvious. 

First,  we  consider  M-1.  Since  M  is  a  block-circulant  (BC)  matrix,  the  inverse  of  M  can  be 
computed  by  using  the  Fast  Fourier  transform  (FFT)  [28] 


Co  Cp- 1  •  • •  C 1 
Ci  C 0  ■  ■ ■  C2 

Cp- 1  CP- 2  •  •  •  C0 


(3.52) 
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Table  3.1:  Complexity  of  PCG-PMF 


Equation 

Flops 

Remark 

(3.43) 

0(J2P) 

calculated  once 

AT1 

0(J2P  log9  P  +  J3P) 

calculated  once 

(3.47) 

0{J3P  log2  P) 

at  k- th  iteration 

(3.48) 

0(J2P) 

at  k- th  iteration 

(3.49) 

0(J2P) 

at  k- th  iteration 

(3.50) 

0(J3P  log2  P) 

at  k- th  iteration 

(3.51) 

0(J2P) 

at  k- th  iteration 

Total 

~  0(rJ3P  log2  P ) 

for  r  iterations 

where 

1  P_1 

Cm  =p2^  WpkmM^,m  =  0, 1,  •  •  •  ,  P  -  1  (3.53) 

k=0 

and  Wpk"‘  =  exp(j2krrnr/P).  It  follows  that  the  computation  of  M'1  is  composed  of  P  ma¬ 
trix  inversions  of  J  x  J  matrices  and  J2  FFTs  of  length  P.  Therefore,  the  total  complexity  is 

0{J2P  log2  P  +  J3P). 

Second,  we  consider  (3.47).  The  main  complexity  of  (3.47)  is  matrix-vector  multiplication 
RyDfcj.  Since  Ry  is  a  JP-dimensional  BT  matrix,  the  above  matrix-vector  multiplication  con¬ 
sists  of  J 2  Toeplitz  matrix-vector  multiplications,  where  each  Toeplitz  matrix  is  a  P  x  P  matrix. 
Each  Toeplitz  matrix- vector  multiplication  can  be  implemented  by  the  FFT  using  0(P  log2  P) 
flops  [29].  Hence,  the  complexity  of  (3.47)  for  each  channel  per  iteration  is  0(J2P  log2  P).  With 
J  channels  and  r  iterations,  the  total  complexity  of  (3.47)  is  0{rJiP  log2  P). 

Finally,  we  consider  (3.50).  Since  the  preconditioner  M  is  a  BC  matrix,  (3.50)  can  again  be 
computed  by  J 2  FFTs  of  length  P.  The  complexity  for  each  channel  per  iteration  is  0(J2P  log2  P ), 
so  the  total  complexity  of  (3.50)  for  J  channels  is  0(rJ3P  log2  P). 

Here,  we  make  a  comparison  between  the  PCG-PMF  and  CG-PMF.  Since  the  condition  num¬ 
ber  of  the  preconditioned  disturbance  covariance  matrix  Ry  is  generally  smaller  than  that  of  Ry, 
PCG-PMF  provides  a  faster  convergence  than  CG-PMF.  The  latter  has  a  complexity  of  0(J3P3). 


3.5  Conjugate-Gradient  PAMF 

The  CG-PMF  algorithm  is  now  extended  to  the  adaptive  case  when  both  the  covariance  matrix 
and  the  AR  model  order  P  are  unknown.  The  resulting  detector  is  referred  to  as  the  CG-PAMF 
detector.  The  extension  of  CG-PMF  involves  i)  replacing  the  true  covariance  matrices  with  esti¬ 
mates  obtained  from  the  target-free  training  data;  and  ii)  integrating  the  CG-based  model  order 
selection  proposed  in  Section  3.4.3  with  conjugate-gradient  iterations.  The  CG-PAMF  detector 
with  order  selection  (OSCG-PAMF)  is  summarized  next. 

•  Step  1:  Estimate  the  disturbance  covariance  matrices  from  the  training  data  via  temporal 


33 


and  range  averaging: 


Rx(0)  RX(P-1)' 

Rf}=  :  ;  0.54) 

RX(1  -p)  ■■■  Rx(0) 

'Rx(-i)' 

R(jp)  =  : 

yx 

_Rx(-P)_ 

where  the  sub-matrices  are  given  by 

1  K  N-m 

RxxM  =  v/x  X!  5Z  +  m)xf  (0 

fc=l  1=1 

with  K  denoting  the  number  of  training  data  vectors  and  P  determined  by  (3.35). 

•  Step  2:  Use  the  CG  algorithm  to  solve 

Rjf)B(p)  =  Rg).  (3.57) 

-  Step  2.1:  Examine  the  residual  eJp  (3.36)  at  each  Jp- th  (p  =  1,  2,  •  •  •  ,  P)  iteration  of 
CG.  If  eJp  <  a0ej(p_i),  where  0  <  a0  <  1  is  a  stopping  threshold,  then  exit  the  CG 
iteration,  and  set  the  AR  model  order  as  P  =  p. 

Unlike  the  original  PAMF  with  an  unknown  AR  model  order  [16],  which  has  to  run  recur¬ 
sively  from  p  =  1  to  a  P  (P  <  P)  to  jointly  estimate  the  AR  coefficients  and  model  order, 
OSCG-PAMF  does  not  contain  any  recursion.  It  only  has  to  perform  CG  with  the  disturbance 
covariance  matrix  Ry  1  for  JP  iterations  to  obtain  a  model  order  estimate. 

Remark:  Several  estimators  can  be  employed  to  find  the  linear  prediction  filters  for  the  PAMF. 
The  estimator  as  represented  by  (3.57)  along  with  the  covariance  matrix  estimates  (3.54)-(3.56) 
is  often  called  the  multichannel  Yule-Walker  method.  Other  estimators,  such  as  the  least-squares 
estimators  [16],  solve  slightly  modified  versions  of  the  linear  equation  (3.57).  It  is  noted  that  in 
most  cases,  the  CG  algorithm  can  be  used  to  efficiently  solve  such  a  modified  linear  equation. 
Due  to  space  limit,  we  will  not  explore  these  alternative  CG-PAMF  detectors. 

A  similar  comparison  can  be  made  between  the  complexity  of  the  OSCG-PAMF  detector  and 
that  of  the  eigencanceler  when  the  covariance  matrix  is  unknown.  In  addition  to  the  numbers 
of  flops  as  summarized  in  Section  3.4.3,  both  have  to  pay  the  extra  complexity  needed  to  esti¬ 
mate  the  covariance  matrix.  In  this  case,  the  OSCG-PAMF  requires  an  additional  complexity 
of  0(J2PNK)  as  incurred  in  (3.54)-(3.56),  whereas  the  extra  complexity  for  the  eigencanceler 
is  0(J2N2K )  that  is  used  to  estimate  a  full  (JN  x  JN)  space-time  covariance  matrix  from  K 
training  signals. 


(3.55) 


(3.56) 
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3.6  Numerical  Results 


In  this  section,  simulation  results  are  provided  to  illustrate  the  performance  of  the  proposed  tech¬ 
niques.  We  consider  simulated  data  generated  using  AR  models  and  the  KASSPER  data  [20], 
which  were  obtained  from  more  realistic  clutter  models.  The  simulation  results  presented  be¬ 
low  use  20000  independent  Monte  Carlo  data  realizations  and  a  probability  of  false  alarm  of 
Pfa  =  UP2.  The  chosen  Pfa  may  be  considered  too  high  for  many  practical  detection  applica¬ 
tions.  It  is  noted  that  the  choice  is  only  to  facilitate  computer  simulation  and  reduce  simulation 
time.  The  main  observations  from  the  simulation,  including  the  convergence  of  the  CG  algorithm 
in  PAMF  detection  and  the  accuracy  of  the  estimated  P  provided  by  the  proposed  model  order 
selection  method,  are  independent  of  the  choice  of  Pfa. 

A  major  issue  that  we  like  to  illustrate  in  the  following  numerical  examples  is  the  convergence 
of  the  CG  algorithm,  with  partial  or  full  iterations,  when  the  data  covariance  matrix  is  known  or 
estimated,  and/or  when  the  AR  model  order  is  known  or  estimated.  To  this  end,  we  compare 
the  various  detectors  used  with  the  CG  algorithm,  including  the  CG-PMF  (Section  3.4.2),  CG- 
PAMF  with  a  known  AR  model  order  (Section  3.5)  and  OSCG-PAMF  with  an  estimated  model 
order  (Section  3.5),  with  the  same  detectors  used  with  direct  matrix  inverse  (DMI).  For  example, 
the  DMI-PAMF  detector  involves  a  direct  inverse  of  the  estimated  covariance  matrix  in  (3.54) 
and  uses  it  compute  the  linear  prediction  filter  (3.28).  This  DMI  approach  turns  out  to  coincide 
with  the  Yule-Walker  method  [26]  for  AR  spectral  estimation.  It  is  noted  in  [16]  that  there  are 
alternative  spectral  estimation  methods  which  may  yield  better  detection  performance  in  some 
scenarios.  These  alternatives  are  not  considered  here  since  the  focus  is  the  convergence  of  the 
CG  algorithm  in  PAMF.  In  the  following,  we  will  primarily  use,  as  a  comparison  metric,  the 
probability  of  detection  versus  the  SINR  for  a  given  probability  of  false  alarm.  The  output  SINR 
of  the  PAMF  detector  was  derived  and  extensively  studied  in  [30]. 

First,  we  examine  the  performance  of  the  two  implementations  of  the  PMF  detector  by  using 
simulated  data  with  AR  disturbances.  The  disturbance  is  an  AR(2)  process  with  J  =  4  elements 
and  N  =  64  pulses.  Both  PMF  detectors  have  knowledge  of  the  exact  disturbance  covariance 
matrix;  however,  they  use  different  approaches  to  compute  the  linear  predictor.  Specifically,  we 
consider  the  DMI-PMF,  which  uses  direct  matrix  inverse  to  solve  the  Wiener-Hopf  equation, 
and  the  CG-PMF  as  discussed  in  Section  3.4.2  with  knowledge  of  the  AR  model  order  P.  The 
numerical  results  are  shown  in  Fig.  3.3.  It  is  seen  that  both  implementations  yield  an  identical 
detection  performance. 

We  next  examine  the  performance  of  the  CG-based  AR  model  order  selection  method  used 
in  the  CG-PMF  and  CG-PAMF  detectors  with  an  unknown  AR  model  order.  Two  AR  distur¬ 
bance  signals  with  J  =  4  and  iV  =  64  are  considered,  and  their  model  orders  are  P  =  1  and 
3,  respectively.  We  choose  the  same  upper  bound  P  =  6  for  both  cases.  The  residual  ek  (3.36) 
is  computed  and  used  for  model  order  selection;  as  a  benchmark,  we  also  include  ek,  which  is 
similarly  computed  as  in  (3.36)  but  with  replaced  by  ek  j.  Recall  that  ek  is  an  approximation 
of  Cfc,  which  cannot  be  computed  in  practice  due  to  the  fact  that  the  true  Wiener-Hopf  solution 
is  unknown.  The  numerical  results  for  CG-PMF  are  shown  in  Fig.  3.4  and  Fig.  3.5,  which  cor¬ 
respond  to  P  =  1  and  P  =  3,  respectively.  It  is  seen  that,  ek  has  a  sharp  decrease  at  the  JP-th 
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J=4,  N=64,  Pf=1 .000000e-002,  AR(2) 


-5  0  5  10  15 

SINR  (dB) 

Figure  3.3:  Probability  of  detection  versus  SINR  of  PMF  for  simulated  data(J  =  4;  N  =  64; 
P  =  2) 

(JP  =  4  for  P  =  1  and  JP  =  12  for  P  =  3)  iteration  of  CG,  which  confirms  the  effectiveness 
of  the  CG-based  model  order  selection  method.  The  counterpart  model  order  selection  results 
for  CG-PAMF  are  shown  in  Fig.  3.6  and  Fig.  3.7,  for  P  =  1  and  P  =  3,  respectively.  Unlike 
the  CG-PMF  which  uses  the  real  disturbance  covariance  matrix,  the  sample  covariance  matrix 
estimated  from  the  training  data  (cf.  (3.42))  is  employed  for  model  order  selection  in  CG-PAMF. 
Here  the  training  data  size  is  set  to  K  =  32.  It  is  also  shown  that  ej  has  a  sharp  decrease  at  the 
JP-th  (JP  =  4  for  P  =  1  and  JP  =  12  for  P  =  3)  iteration  of  CG,  although  the  decrease  in 
residue  is  smaller  than  that  of  CG-PMF  due  to  estimation  error  of  the  sample  covariance  matrix. 

We  now  consider  the  convergence  of  PCG-PMF.  The  simulated  disturbance  is  an  AR(8)  mul¬ 
tichannel  process  with  J  =  4.  The  convergence  of  CG-PMF  and  PCG-PMF  is  shown  in  Fig.  3.8. 
The  condition  number  of  the  preconditioned  covariance  matrix  is  4.2,  which  is  much  less  than 
the  condition  number  of  the  original  covariance  matrix  77.1.  It  is  seen  from  Fig.  3.8  that  only 
5  iterations  are  needed  in  PCG-PMF  to  reach  a  relative  approximation  error  under  1%,  while  20 
iterations  are  needed  for  CG-PMF. 

Our  next  example  considers  the  adaptive  PAMF  detector,  for  which  the  disturbance  covari¬ 
ance  matrix  is  unknown  and  the  sample  covariance  matrix  is  estimated  by  (3.54).  Similar  to 
the  PMF  detector,  we  compare  two  implementations  of  the  PAMF  detector,  including  the  DMI- 
PAMF  and  CG-PAMF,  Here  the  DMI-PAMF  directly  inverses  the  sample  covariance  matrix  to 
get  the  maximum-likelihood  estimation  of  AR  coefficients  [26].  The  disturbance  is  an  AR(2) 
signal,  whose  disturbance  covariance  matrix  is  estimated  from  K  =  16  target-free  training  data 
vectors,  and  the  AR  coefficients  are  estimated  based  on  the  estimated  disturbance  covariance  ma¬ 
trix.  The  numerical  results  are  shown  in  Fig.  3.9.  It  is  observed  that  both  implementations  yield 
an  identical  detection  performance. 

The  performance  of  the  CG-PAMF  with  an  unknown  disturbance  AR  model  order  and  dis- 
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Figure  3.4:  Residuals  ejp  and  ijp  for  model  order  selection  in  CG-PMF  ( J  —  4;  N  —  64;  P  —  1; 
P  =  6) 

J=4,  N=64,  P  =6,  AR(3) 

max  '  ' 
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Figure  3.5:  Residuals  eJp  and  eJp  for  model  order  selection  in  CG-PMF  (J  =  4;  N  =  64;  P  =  3; 
P  =  6) 


turbance  covariance  matrix  is  considered  next.  Both  AR  model  based  data  and  KASSPER  2002 
data  set  are  employed  in  this  example.  The  KASSPER  data  set  was  generated  by  considering 
practical  airborne  radar  parameters  and  issues  found  in  a  real-world  clutter  environment  [20]. 
Specifically,  the  simulated  airborne  radar  platform  travels  at  a  speed  of  100  m/s  with  a  3°  crab 
angle.  The  radar  carrier  frequency  is  1240  MHz.  The  horizontal  11  antenna  elements  form  a 
ULA  with  a  spacing  of  0.1092m  between  adjacent  elements,  and  the  transmit  array  is  uniformly 
weighted  and  phased  to  steer  the  main  beam  to  195°.  The  pulse  repetition  frequency  is  1984  Hz 
and  a  coherent  processing  interval  contains  32  pulses.  Only  the  first  8  elements  are  used  in  our 
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K=32,  J=4,  N=64,  P  =6,  AR(1 ) 
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Number  of  iterations  (k) 

Figure  3.6:  Residuals  ejp  and  ijp  for  model  order  selection  in  CG-PAMF  (K  =  32;  J  —  4; 
N  =  64;  P  =  1;  P  =  6) 


K=32,  J=4,  N=64,  P  =6,  AR(3) 
max  '  ' 


Number  of  iterations  (k) 

Figure  3.7:  Residuals  ejp  and  ijp  for  model  order  selection  in  CG-PAMF  (K  =  32;  J  =  4; 
N^=  64;  P  =  3;  P  =  6) 

simulation.  We  use  the  covariance  matrix  associated  with  range  bin  200  in  the  KASSPER  data  set 
to  generate  the  test  data  and  the  covariance  matrices  from  the  neighboring  ranges  bins  to  generate 
the  training  signals.  A  target  is  injected  into  the  test  cell  with  a  normalized  spatial  frequency  0.1 
and  a  normalized  Doppler  frequency  0.35. 

The  numerical  results  are  shown  in  Fig.  3.10  for  the  AR  model  based  data  and,  respectively, 
Fig.  3.11  for  the  KASSPER  2000  data,  where  OSCG-PAMF  (unknown  P)  represents  the  CG- 
PAMF  detector  with  the  CG-based  model  order  selection  method  which  employs  a  model  order 


38 


Figure  3.8:  Convergence  of  CG-PMF  and  PCG-PMF  (J  =  4;  P  =  8) 


K=1 6,  J=4,  N=64,  P=1 .000000e-002,  AR(2) 


Figure  3.9:  Probability  of  detection  versus  SINR  of  PAMF  for  AR  data  ( K  =  16;  J  =  A;  N  =  64; 
P  =  2) 

upper  bound  P  calculated  by  (3.35).  In  Fig.  3.11,  we  also  include  for  comparison  the  joint 
domain  localized  (JDL)  detector  [3 1],  a  popular  reduced-dimensional  STAP  solution  in  scenarios 
of  limited  training.  The  JDL  is  implemented  by  using  3  beams  and  3  Doppler  bins  for  adaptivity. 
It  is  seen  that  the  performance  of  the  OSCG-PAMF  is  nearly  identical  to  that  of  CG-PAMF  with 
known  P  (AR  data)  or  a  pre-selected  P  =  2  (KASSPER  data).  For  the  case  of  AR  data,  we 
noticed  that  only  one  model  order  selection  error  (P  ^  P)  occurred  out  of  20000  simulations. 
Moreover,  using  the  relevant  parameters  of  the  KASSPER  data,  we  have  (3  =  2 vgTr/d  =  0.923. 
It  follows  that  for  J  =  8  elements,  the  maximum  number  of  conjugate-gradient  iterations  needed 
by  the  CG-PAMF  for  a  given  model  order  p  is  estimated  to  be  rcp  +  1  =  [8.077  +  0.923p].  For 
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Figure  3.10:  Probability  of  detection  versus  SINR  for  AR  data  (K  =  32;  J  =  4;  N  =  64;  P  =  2; 
P  =  6) 

example,  the  maximum  numbers  of  CG  iterations  for  p  —  2  is  about  10  due  to  the  low-rank 
structure  of  the  clutter,  whereas  without  such  a  structure,  it  would  require  p  J  =  16  iterations  for 
the  CG  to  converge.  It  is  also  seen  in  Fig.  3.11  that  the  PAMF  detectors  outperform  the  JDL-AMF 
detector.  The  JDL-AML  experiences  a  loss  of  about  4  dB  compared  with  the  MF. 

Finally,  we  compare  the  complexity  in  terms  of  the  number  of  flops  required  by  the  CG  and 
DMI  implementations.  The  flops  required  by  the  CG-PAMF  and  DMI-PAMF  versus  the  AR 
model  order  p  are  shown  in  Fig.  3.12.  For  the  DMI-PAMF,  the  QR  decomposition  is  adopted  to 
get  the  ./-channel  AR  coefficients.  It  is  seen  that  the  complexity  of  the  CG-PAMF  is  lower  than 
that  of  the  DMI-PAMF. 

3.7  Concluding  Remarks 

The  conjugate-gradient  (CG)  algorithm  was  employed  to  solve  the  linear  prediction  problem  un¬ 
derlying  the  parametric  matched  filter  (PMF)  and  parametric  adaptive  matched  filter  (PAMF) 
detectors.  It  is  shown  that  the  CG  algorithm  leads  to  not  only  new  efficient  implementations,  but 
also  new  insights  of  these  parametric  detectors  as  reduced-dimensional  subspace  detectors.  In 
particular,  the  linear  prediction  filter  and  whitening  filter  of  the  PMF  and  PAMF  detectors  are 
within  the  Krylov  subspace  of  dimension  JP,  and  these  detectors  are  reduced  .//'-dimensional 
subspace  detectors,  where  J  and  P  are  the  number  of  channels  and  AR  model  order,  respec¬ 
tively.  We  examined  the  convergence  rate  of  the  CG  parametric  detectors.  In  airborne  radar 
applications,  the  special  low-rank  structure  of  the  disturbance  covariance  matrix  implies  that  a 
rapid  convergence  is  possible,  whereby  convergence  can  be  achieved  without  completing  a  full 
round  of  CG  iterations.  Even  for  disturbance  covariance  matrices  that  do  not  have  the  low-rank 
structure,  preconditioning  methods  can  be  used  to  speed  up  the  convergence  rate.  In  general,  the 
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Figure  3.11:  Probability  of  detection  versus  SINR  for  KASSPER  2002  data  ( K  =  32 ;  J  =  8; 
N  =  32;  P  =  2;  P  =  2) 


Figure  3.12:  Computational  complexity  of  CG-PAMF  and  DMI-PAMF  versus  AR  model  order  p 

(K  =  32-J  =  8;N  =  32). 

CG  parametric  detectors  are  more  efficient  than  their  counterparts  implemented  in  conventional 
approaches.  We  also  presented  a  new  CG-based  AR  model  order  selection  method,  which  is  nat¬ 
urally  integrated  with  the  CG  iterations.  The  proposed  techniques  are  illustrated  by  using  both 
KASSPER  and  other  simulated  data. 

Finally,  we  note  that  the  CG  algorithm  bears  some  similarity  to  a  vector  space  approach  [32] 
to  solving  the  multi-dimensional  Yule -Walker  equation  for  an  arbitrary  region  of  support.  Both 
involve  the  use  of  conjugate  orthogonal  basis  vectors.  A  future  subject  would  be  to  investigate 
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the  relation  of  the  two  approaches  and  explore  the  application  of  the  CG  algorithm  for  multi¬ 
dimensional  and  multichannel  applications. 


3.8  Appendix:  Proof  of  Lemma  1 

It  is  well  known  that  the  conjugate  direction  vectors  obtained  by  the  CG  algorithm  solving  the 
Wiener-Hopf  equation  (3.30)  span  the  Krylov  subspace  [10]: 


/C(Rf Rf\  k )  =  span  \  BfJ,  •  •  •  ,  D£ 


d-P)  iv(P) 


(3.58) 


Furthermore,  the  truncated  solution  obtained  at  the  k- th  iteration  Bj.P)  minimizes  the  Ry  '-norm 
of  the  approximation  error  over  all  vectors  on  /C(Rj.  RyP\  k )  [33],  i.e. 


■p>(-P) _ p>(P) 

j  Rf) 


min  x  —  ^  a/,:Ry  ,)  * 


(3.59) 


(  P)  ( P ) 

Therefore,  the  truncated  solution  obtained  at  the  k- th  iteration  B).  ’  is  the  Ry  -orthogonal  pro¬ 
jection  of  the  Wiener  solution  BP  1  to  the  subspace  /C(R.;,  RyP),  k),  and 


^ k,j 


(P) 

cy  -i  •  i 

uij  ’ 


c2 ,j  >  >  u k,j 


(3.60) 


contains  the  coordinate  values  of  conjugate-direction  vectors  |d^P\  D^,  •  •  •  ,  D^j,  which 
are  given  by 

(P)  Dg^RpB^ 

a)  ’  =  —4? - ^ (3 

d£»»<»d«5 

With  the  definition  of  D k,j  by  (3.32),  we  can  write  after  JP  iterations 


<*JPJ  =  Djp.Bf ) 


(3.62) 


where  BJPJ  =  [Dij,  D 2j,  •  •  •  ,  DJPj].  Recalling  B^P)  =  [Bj,  0j(p_p)xl]'i',  we  have 


aJP,j  ~  Pjpj 


(3.63) 


where  DjPj  e  CJPxJP  is  the  upper  JP  x  JP  block  matrix  of  Bjpj,  and  DJPd  contains  the 
lower  block  of  D./pj.  Then  B;  is  given  by 


B?  —  DjPjQjPj. 


(3.64) 
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The  intermediate  solution  obtained  at  the  JP-th  CG  iteration  is 

jp 

B<>%  =  y  =  0%aJPJ.  (3.65) 

m=  1 

It  follows  from  (3.64)  and  (3.65)  that  B {fpj  and  By  are  related  by 

B%  =  =  WjpjBj  (3.66) 

where  W JPj  =  Djp^Djpj  e  CJPxJP,  which  completes  the  proof. 
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